{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 探索性数据分析（EDA）\n",
    "\n",
    "\n",
    "有趣的数据集，包含球员和裁判之间的故事！\n",
    "\n",
    "\n",
    "数据集介绍点击： [here](https://osf.io/47tnc/).\n",
    "\n",
    "<img src=\"figures/f1.png\" alt=\"FAO\" width=\"290\" >"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "## 任务：\n",
    "\n",
    "探索性数据分析（EDA）. 挑战目标: **这些裁判在给红牌的时候咋想的呢，会不会被跟球员的肤色有关?**\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "%matplotlib inline\n",
    "%config InlineBackend.figure_format='retina'\n",
    "\n",
    "#from __future__ import absolute_import, division, print_function\n",
    "import matplotlib as mpl\n",
    "from matplotlib import pyplot as plt\n",
    "from matplotlib.pyplot import GridSpec\n",
    "import seaborn as sns\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "import os, sys\n",
    "from tqdm import tqdm\n",
    "import warnings\n",
    "warnings.filterwarnings('ignore')\n",
    "sns.set_context(\"poster\", font_scale=1.3)\n",
    "\n",
    "#import missingno as msno\n",
    "#import pandas_profiling\n",
    "\n",
    "from sklearn.datasets import make_blobs\n",
    "import time"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 数据简介：\n",
    "\n",
    "> 数据包含球员和裁判的信息，2012-2013年的比赛数据，总共设计球员2053名，裁判3147名，特征列表如下：\n",
    "\n",
    "> -- https://docs.google.com/document/d/1uCF5wmbcL90qvrk_J27fWAvDcDNrO9o_APkicwRkOKc/edit\n",
    "\n",
    "\n",
    "| Variable Name: | Variable Description: | \n",
    "| -- | -- | \n",
    "| playerShort | short player ID | \n",
    "| player | player name | \n",
    "| club | player club | \n",
    "| leagueCountry | country of player club (England, Germany, France, and Spain) | \n",
    "| height | player height (in cm) | \n",
    "| weight | player weight (in kg) | \n",
    "| position | player position | \n",
    "| games | number of games in the player-referee dyad | \n",
    "| goals | number of goals in the player-referee dyad | \n",
    "| yellowCards | number of yellow cards player received from the referee | \n",
    "| yellowReds | number of yellow-red cards player received from the referee | \n",
    "| redCards | number of red cards player received from the referee | \n",
    "| photoID | ID of player photo (if available) | \n",
    "| rater1 | skin rating of photo by rater 1 | \n",
    "| rater2 | skin rating of photo by rater 2 | \n",
    "| refNum | unique referee ID number (referee name removed for anonymizing purposes) | \n",
    "| refCountry | unique referee country ID number | \n",
    "| meanIAT | mean implicit bias score (using the race IAT) for referee country | \n",
    "| nIAT | sample size for race IAT in that particular country | \n",
    "| seIAT | standard error for mean estimate of race IAT   | \n",
    "| meanExp | mean explicit bias score (using a racial thermometer task) for referee country | \n",
    "| nExp | sample size for explicit bias in that particular country | \n",
    "| seExp |  standard error for mean estimate of explicit bias measure | \n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Uncomment one of the following lines and run the cell:\n",
    "\n",
    "df = pd.read_csv(\"redcard.csv.gz\", compression='gzip')\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(146028, 28)"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>playerShort</th>\n",
       "      <th>player</th>\n",
       "      <th>club</th>\n",
       "      <th>leagueCountry</th>\n",
       "      <th>birthday</th>\n",
       "      <th>height</th>\n",
       "      <th>weight</th>\n",
       "      <th>position</th>\n",
       "      <th>games</th>\n",
       "      <th>victories</th>\n",
       "      <th>...</th>\n",
       "      <th>rater2</th>\n",
       "      <th>refNum</th>\n",
       "      <th>refCountry</th>\n",
       "      <th>Alpha_3</th>\n",
       "      <th>meanIAT</th>\n",
       "      <th>nIAT</th>\n",
       "      <th>seIAT</th>\n",
       "      <th>meanExp</th>\n",
       "      <th>nExp</th>\n",
       "      <th>seExp</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>lucas-wilchez</td>\n",
       "      <td>Lucas Wilchez</td>\n",
       "      <td>Real Zaragoza</td>\n",
       "      <td>Spain</td>\n",
       "      <td>31.08.1983</td>\n",
       "      <td>177.0</td>\n",
       "      <td>72.0</td>\n",
       "      <td>Attacking Midfielder</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.50</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>GRC</td>\n",
       "      <td>0.326391</td>\n",
       "      <td>712.0</td>\n",
       "      <td>0.000564</td>\n",
       "      <td>0.396000</td>\n",
       "      <td>750.0</td>\n",
       "      <td>0.002696</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>john-utaka</td>\n",
       "      <td>John Utaka</td>\n",
       "      <td>Montpellier HSC</td>\n",
       "      <td>France</td>\n",
       "      <td>08.01.1982</td>\n",
       "      <td>179.0</td>\n",
       "      <td>82.0</td>\n",
       "      <td>Right Winger</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.75</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>ZMB</td>\n",
       "      <td>0.203375</td>\n",
       "      <td>40.0</td>\n",
       "      <td>0.010875</td>\n",
       "      <td>-0.204082</td>\n",
       "      <td>49.0</td>\n",
       "      <td>0.061504</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>abdon-prats</td>\n",
       "      <td>Abdón Prats</td>\n",
       "      <td>RCD Mallorca</td>\n",
       "      <td>Spain</td>\n",
       "      <td>17.12.1992</td>\n",
       "      <td>181.0</td>\n",
       "      <td>79.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>ESP</td>\n",
       "      <td>0.369894</td>\n",
       "      <td>1785.0</td>\n",
       "      <td>0.000229</td>\n",
       "      <td>0.588297</td>\n",
       "      <td>1897.0</td>\n",
       "      <td>0.001002</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>pablo-mari</td>\n",
       "      <td>Pablo Marí</td>\n",
       "      <td>RCD Mallorca</td>\n",
       "      <td>Spain</td>\n",
       "      <td>31.08.1993</td>\n",
       "      <td>191.0</td>\n",
       "      <td>87.0</td>\n",
       "      <td>Center Back</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>ESP</td>\n",
       "      <td>0.369894</td>\n",
       "      <td>1785.0</td>\n",
       "      <td>0.000229</td>\n",
       "      <td>0.588297</td>\n",
       "      <td>1897.0</td>\n",
       "      <td>0.001002</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>ruben-pena</td>\n",
       "      <td>Rubén Peña</td>\n",
       "      <td>Real Valladolid</td>\n",
       "      <td>Spain</td>\n",
       "      <td>18.07.1991</td>\n",
       "      <td>172.0</td>\n",
       "      <td>70.0</td>\n",
       "      <td>Right Midfielder</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>ESP</td>\n",
       "      <td>0.369894</td>\n",
       "      <td>1785.0</td>\n",
       "      <td>0.000229</td>\n",
       "      <td>0.588297</td>\n",
       "      <td>1897.0</td>\n",
       "      <td>0.001002</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 28 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "     playerShort         player             club leagueCountry    birthday  \\\n",
       "0  lucas-wilchez  Lucas Wilchez    Real Zaragoza         Spain  31.08.1983   \n",
       "1     john-utaka     John Utaka  Montpellier HSC        France  08.01.1982   \n",
       "2    abdon-prats    Abdón Prats     RCD Mallorca         Spain  17.12.1992   \n",
       "3     pablo-mari     Pablo Marí     RCD Mallorca         Spain  31.08.1993   \n",
       "4     ruben-pena     Rubén Peña  Real Valladolid         Spain  18.07.1991   \n",
       "\n",
       "   height  weight              position  games  victories  ...  rater2  \\\n",
       "0   177.0    72.0  Attacking Midfielder      1          0  ...    0.50   \n",
       "1   179.0    82.0          Right Winger      1          0  ...    0.75   \n",
       "2   181.0    79.0                   NaN      1          0  ...     NaN   \n",
       "3   191.0    87.0           Center Back      1          1  ...     NaN   \n",
       "4   172.0    70.0      Right Midfielder      1          1  ...     NaN   \n",
       "\n",
       "   refNum  refCountry  Alpha_3   meanIAT    nIAT     seIAT   meanExp    nExp  \\\n",
       "0       1           1      GRC  0.326391   712.0  0.000564  0.396000   750.0   \n",
       "1       2           2      ZMB  0.203375    40.0  0.010875 -0.204082    49.0   \n",
       "2       3           3      ESP  0.369894  1785.0  0.000229  0.588297  1897.0   \n",
       "3       3           3      ESP  0.369894  1785.0  0.000229  0.588297  1897.0   \n",
       "4       3           3      ESP  0.369894  1785.0  0.000229  0.588297  1897.0   \n",
       "\n",
       "      seExp  \n",
       "0  0.002696  \n",
       "1  0.061504  \n",
       "2  0.001002  \n",
       "3  0.001002  \n",
       "4  0.001002  \n",
       "\n",
       "[5 rows x 28 columns]"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>count</th>\n",
       "      <th>mean</th>\n",
       "      <th>std</th>\n",
       "      <th>min</th>\n",
       "      <th>25%</th>\n",
       "      <th>50%</th>\n",
       "      <th>75%</th>\n",
       "      <th>max</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>height</th>\n",
       "      <td>145765.0</td>\n",
       "      <td>181.935938</td>\n",
       "      <td>6.738726</td>\n",
       "      <td>1.610000e+02</td>\n",
       "      <td>177.000000</td>\n",
       "      <td>182.000000</td>\n",
       "      <td>187.000000</td>\n",
       "      <td>2.030000e+02</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>weight</th>\n",
       "      <td>143785.0</td>\n",
       "      <td>76.075662</td>\n",
       "      <td>7.140906</td>\n",
       "      <td>5.400000e+01</td>\n",
       "      <td>71.000000</td>\n",
       "      <td>76.000000</td>\n",
       "      <td>81.000000</td>\n",
       "      <td>1.000000e+02</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>games</th>\n",
       "      <td>146028.0</td>\n",
       "      <td>2.921166</td>\n",
       "      <td>3.413633</td>\n",
       "      <td>1.000000e+00</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>2.000000</td>\n",
       "      <td>3.000000</td>\n",
       "      <td>4.700000e+01</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>victories</th>\n",
       "      <td>146028.0</td>\n",
       "      <td>1.278344</td>\n",
       "      <td>1.790725</td>\n",
       "      <td>0.000000e+00</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>2.000000</td>\n",
       "      <td>2.900000e+01</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ties</th>\n",
       "      <td>146028.0</td>\n",
       "      <td>0.708241</td>\n",
       "      <td>1.116793</td>\n",
       "      <td>0.000000e+00</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>1.400000e+01</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>defeats</th>\n",
       "      <td>146028.0</td>\n",
       "      <td>0.934581</td>\n",
       "      <td>1.383059</td>\n",
       "      <td>0.000000e+00</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>1.800000e+01</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>goals</th>\n",
       "      <td>146028.0</td>\n",
       "      <td>0.338058</td>\n",
       "      <td>0.906481</td>\n",
       "      <td>0.000000e+00</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>2.300000e+01</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>yellowCards</th>\n",
       "      <td>146028.0</td>\n",
       "      <td>0.385364</td>\n",
       "      <td>0.795333</td>\n",
       "      <td>0.000000e+00</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>1.400000e+01</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>yellowReds</th>\n",
       "      <td>146028.0</td>\n",
       "      <td>0.011381</td>\n",
       "      <td>0.107931</td>\n",
       "      <td>0.000000e+00</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>3.000000e+00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>redCards</th>\n",
       "      <td>146028.0</td>\n",
       "      <td>0.012559</td>\n",
       "      <td>0.112889</td>\n",
       "      <td>0.000000e+00</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>2.000000e+00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>rater1</th>\n",
       "      <td>124621.0</td>\n",
       "      <td>0.264255</td>\n",
       "      <td>0.295382</td>\n",
       "      <td>0.000000e+00</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.250000</td>\n",
       "      <td>0.250000</td>\n",
       "      <td>1.000000e+00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>rater2</th>\n",
       "      <td>124621.0</td>\n",
       "      <td>0.302862</td>\n",
       "      <td>0.293020</td>\n",
       "      <td>0.000000e+00</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.250000</td>\n",
       "      <td>0.500000</td>\n",
       "      <td>1.000000e+00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>refNum</th>\n",
       "      <td>146028.0</td>\n",
       "      <td>1534.827444</td>\n",
       "      <td>918.736625</td>\n",
       "      <td>1.000000e+00</td>\n",
       "      <td>641.000000</td>\n",
       "      <td>1604.000000</td>\n",
       "      <td>2345.000000</td>\n",
       "      <td>3.147000e+03</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>refCountry</th>\n",
       "      <td>146028.0</td>\n",
       "      <td>29.642842</td>\n",
       "      <td>27.496189</td>\n",
       "      <td>1.000000e+00</td>\n",
       "      <td>7.000000</td>\n",
       "      <td>21.000000</td>\n",
       "      <td>44.000000</td>\n",
       "      <td>1.610000e+02</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>meanIAT</th>\n",
       "      <td>145865.0</td>\n",
       "      <td>0.346276</td>\n",
       "      <td>0.032246</td>\n",
       "      <td>-4.725423e-02</td>\n",
       "      <td>0.334684</td>\n",
       "      <td>0.336628</td>\n",
       "      <td>0.369894</td>\n",
       "      <td>5.737933e-01</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>nIAT</th>\n",
       "      <td>145865.0</td>\n",
       "      <td>19697.411216</td>\n",
       "      <td>127126.197143</td>\n",
       "      <td>2.000000e+00</td>\n",
       "      <td>1785.000000</td>\n",
       "      <td>2882.000000</td>\n",
       "      <td>7749.000000</td>\n",
       "      <td>1.975803e+06</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>seIAT</th>\n",
       "      <td>145865.0</td>\n",
       "      <td>0.000631</td>\n",
       "      <td>0.004736</td>\n",
       "      <td>2.235373e-07</td>\n",
       "      <td>0.000055</td>\n",
       "      <td>0.000151</td>\n",
       "      <td>0.000229</td>\n",
       "      <td>2.862871e-01</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>meanExp</th>\n",
       "      <td>145865.0</td>\n",
       "      <td>0.452026</td>\n",
       "      <td>0.217469</td>\n",
       "      <td>-1.375000e+00</td>\n",
       "      <td>0.336101</td>\n",
       "      <td>0.356446</td>\n",
       "      <td>0.588297</td>\n",
       "      <td>1.800000e+00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>nExp</th>\n",
       "      <td>145865.0</td>\n",
       "      <td>20440.233860</td>\n",
       "      <td>130615.745103</td>\n",
       "      <td>2.000000e+00</td>\n",
       "      <td>1897.000000</td>\n",
       "      <td>3011.000000</td>\n",
       "      <td>7974.000000</td>\n",
       "      <td>2.029548e+06</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>seExp</th>\n",
       "      <td>145865.0</td>\n",
       "      <td>0.002994</td>\n",
       "      <td>0.019723</td>\n",
       "      <td>1.043334e-06</td>\n",
       "      <td>0.000225</td>\n",
       "      <td>0.000586</td>\n",
       "      <td>0.001002</td>\n",
       "      <td>1.060660e+00</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                count          mean            std           min          25%  \\\n",
       "height       145765.0    181.935938       6.738726  1.610000e+02   177.000000   \n",
       "weight       143785.0     76.075662       7.140906  5.400000e+01    71.000000   \n",
       "games        146028.0      2.921166       3.413633  1.000000e+00     1.000000   \n",
       "victories    146028.0      1.278344       1.790725  0.000000e+00     0.000000   \n",
       "ties         146028.0      0.708241       1.116793  0.000000e+00     0.000000   \n",
       "defeats      146028.0      0.934581       1.383059  0.000000e+00     0.000000   \n",
       "goals        146028.0      0.338058       0.906481  0.000000e+00     0.000000   \n",
       "yellowCards  146028.0      0.385364       0.795333  0.000000e+00     0.000000   \n",
       "yellowReds   146028.0      0.011381       0.107931  0.000000e+00     0.000000   \n",
       "redCards     146028.0      0.012559       0.112889  0.000000e+00     0.000000   \n",
       "rater1       124621.0      0.264255       0.295382  0.000000e+00     0.000000   \n",
       "rater2       124621.0      0.302862       0.293020  0.000000e+00     0.000000   \n",
       "refNum       146028.0   1534.827444     918.736625  1.000000e+00   641.000000   \n",
       "refCountry   146028.0     29.642842      27.496189  1.000000e+00     7.000000   \n",
       "meanIAT      145865.0      0.346276       0.032246 -4.725423e-02     0.334684   \n",
       "nIAT         145865.0  19697.411216  127126.197143  2.000000e+00  1785.000000   \n",
       "seIAT        145865.0      0.000631       0.004736  2.235373e-07     0.000055   \n",
       "meanExp      145865.0      0.452026       0.217469 -1.375000e+00     0.336101   \n",
       "nExp         145865.0  20440.233860  130615.745103  2.000000e+00  1897.000000   \n",
       "seExp        145865.0      0.002994       0.019723  1.043334e-06     0.000225   \n",
       "\n",
       "                     50%          75%           max  \n",
       "height        182.000000   187.000000  2.030000e+02  \n",
       "weight         76.000000    81.000000  1.000000e+02  \n",
       "games           2.000000     3.000000  4.700000e+01  \n",
       "victories       1.000000     2.000000  2.900000e+01  \n",
       "ties            0.000000     1.000000  1.400000e+01  \n",
       "defeats         1.000000     1.000000  1.800000e+01  \n",
       "goals           0.000000     0.000000  2.300000e+01  \n",
       "yellowCards     0.000000     1.000000  1.400000e+01  \n",
       "yellowReds      0.000000     0.000000  3.000000e+00  \n",
       "redCards        0.000000     0.000000  2.000000e+00  \n",
       "rater1          0.250000     0.250000  1.000000e+00  \n",
       "rater2          0.250000     0.500000  1.000000e+00  \n",
       "refNum       1604.000000  2345.000000  3.147000e+03  \n",
       "refCountry     21.000000    44.000000  1.610000e+02  \n",
       "meanIAT         0.336628     0.369894  5.737933e-01  \n",
       "nIAT         2882.000000  7749.000000  1.975803e+06  \n",
       "seIAT           0.000151     0.000229  2.862871e-01  \n",
       "meanExp         0.356446     0.588297  1.800000e+00  \n",
       "nExp         3011.000000  7974.000000  2.029548e+06  \n",
       "seExp           0.000586     0.001002  1.060660e+00  "
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.describe().T"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "playerShort       object\n",
       "player            object\n",
       "club              object\n",
       "leagueCountry     object\n",
       "birthday          object\n",
       "height           float64\n",
       "weight           float64\n",
       "position          object\n",
       "games              int64\n",
       "victories          int64\n",
       "ties               int64\n",
       "defeats            int64\n",
       "goals              int64\n",
       "yellowCards        int64\n",
       "yellowReds         int64\n",
       "redCards           int64\n",
       "photoID           object\n",
       "rater1           float64\n",
       "rater2           float64\n",
       "refNum             int64\n",
       "refCountry         int64\n",
       "Alpha_3           object\n",
       "meanIAT          float64\n",
       "nIAT             float64\n",
       "seIAT            float64\n",
       "meanExp          float64\n",
       "nExp             float64\n",
       "seExp            float64\n",
       "dtype: object"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.dtypes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['playerShort',\n",
       " 'player',\n",
       " 'club',\n",
       " 'leagueCountry',\n",
       " 'birthday',\n",
       " 'height',\n",
       " 'weight',\n",
       " 'position',\n",
       " 'games',\n",
       " 'victories',\n",
       " 'ties',\n",
       " 'defeats',\n",
       " 'goals',\n",
       " 'yellowCards',\n",
       " 'yellowReds',\n",
       " 'redCards',\n",
       " 'photoID',\n",
       " 'rater1',\n",
       " 'rater2',\n",
       " 'refNum',\n",
       " 'refCountry',\n",
       " 'Alpha_3',\n",
       " 'meanIAT',\n",
       " 'nIAT',\n",
       " 'seIAT',\n",
       " 'meanExp',\n",
       " 'nExp',\n",
       " 'seExp']"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "all_columns = df.columns.tolist()\n",
    "all_columns"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Challenge\n",
    "\n",
    "Before looking below, try to answer some high level questions about the dataset. \n",
    "\n",
    "\n",
    "How do we operationalize the question of referees giving more red cards to dark skinned players?\n",
    "* Counterfactual: if the player were lighter, a ref is more likely to have given a yellow or no card **for the same offense under the same conditions**\n",
    "* Regression: accounting for confounding, darker players have positive coefficient on regression against proportion red/total card\n",
    "\n",
    "Potential issues\n",
    "* How to combine rater1 and rater2? Average them? What if they disagree? Throw it out?\n",
    "* Is data imbalanced, i.e. red cards are very rare?\n",
    "* Is data biased, i.e. players have different amounts of play time? Is this a summary of their whole career?\n",
    "* How do I know I've accounted for all forms of confounding?\n",
    "\n",
    "**First, is there systematic discrimination across all refs?**\n",
    "\n",
    "Exploration/hypotheses:\n",
    "* Distribution of games played\n",
    "* red cards vs games played\n",
    "* Reds per game played vs total cards per game played by skin color\n",
    "* Distribution of # red, # yellow, total cards, and fraction red per game played for all players by avg skin color\n",
    "* How many refs did players encounter?\n",
    "* Do some clubs play more aggresively and get carded more? Or are more reserved and get less?\n",
    "* Does carding vary by leagueCountry?\n",
    "* Do high scorers get more slack (fewer cards) for the same position?\n",
    "* Are there some referees that give more red/yellow cards than others?\n",
    "* how consistent are raters? Check with Cohen's kappa.\n",
    "* how do red cards vary by position? e.g. defenders get more?\n",
    "* Do players with more games get more cards, and is there difference across skin color?\n",
    "* indication of bias depending on refCountry?"
   ]
  },
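  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One of the questions above is how consistent the two raters are. As a minimal sketch, unweighted Cohen's kappa can be computed by hand from a contingency table. The helper name `cohen_kappa` and the toy ratings below are assumptions made for illustration; they are not taken from the dataset or from the original analysis.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "def cohen_kappa(a, b):\n",
    "    # Unweighted Cohen's kappa for two equally long sequences of categorical labels.\n",
    "    # Hypothetical helper; it does not guard against p_expected == 1.\n",
    "    a = pd.Series(a).reset_index(drop=True)\n",
    "    b = pd.Series(b).reset_index(drop=True)\n",
    "    labels = sorted(set(a) | set(b))\n",
    "    # Joint proportions over a shared label set, so the table is square\n",
    "    table = pd.crosstab(a, b, normalize=True)\n",
    "    table = table.reindex(index=labels, columns=labels, fill_value=0.0)\n",
    "    # Observed agreement: mass on the diagonal\n",
    "    p_observed = np.trace(table.values)\n",
    "    # Agreement expected by chance: product of the two raters' marginals\n",
    "    p_expected = float(table.sum(axis=1).values @ table.sum(axis=0).values)\n",
    "    return (p_observed - p_expected) / (1.0 - p_expected)\n",
    "\n",
    "# Invented toy ratings on a 0-1 scale in steps of 0.25\n",
    "toy = pd.DataFrame({\n",
    "    'rater1': [0.0, 0.25, 0.25, 0.5, 1.0, 0.0],\n",
    "    'rater2': [0.0, 0.25, 0.5, 0.5, 1.0, 0.25],\n",
    "})\n",
    "kappa = cohen_kappa(toy['rater1'], toy['rater2'])\n",
    "print(round(kappa, 3))\n",
    "```\n",
    "\n",
    "On the real data, a reasonable starting point might be to drop rows where either rating is missing and to keep one row per `playerShort` before calling the helper, since a player's ratings repeat across that player's dyads."
   ]
  },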
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Understand how the data's organized\n",
    "\n",
    "The dataset is a single csv where it aggregated every interaction between referee and player into a single row. In other words: Referee A refereed Player B in, say, 10 games, and gave 2 redcards during those 10 games. Then there would be a unique row in the dataset that said: \n",
    "\n",
    "    Referee A, Player B, 2 redcards, ... \n",
    "\n",
    "This has several implications that make this first step to understanding and dealing with this data a bit tricky. First, is that the information about Player B is repeated each time -- meaning if we did a simple average of some metric of we would likely get a misleading result. \n",
    "\n",
    "For example, asking \"what is the average `weight` of the players?\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "181.93593798236887"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df['height'].mean()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "playerShort\n",
       "aaron-hughes              182.0\n",
       "aaron-hunt                183.0\n",
       "aaron-lennon              165.0\n",
       "aaron-ramsey              178.0\n",
       "abdelhamid-el-kaoutari    180.0\n",
       "abdon-prats               181.0\n",
       "abdou-dampha              184.0\n",
       "abdou-traore_2            180.0\n",
       "abdoul-camara             177.0\n",
       "abdoulaye-diallo_2        189.0\n",
       "abdoulaye-diallo_3        182.0\n",
       "abdoulaye-keita_2         188.0\n",
       "abdoulaye-sane            184.0\n",
       "abdoulwhaid-sissoko       180.0\n",
       "abdul-rahman-baba         179.0\n",
       "abdul-razak               180.0\n",
       "abel-aguilar              185.0\n",
       "abel-khaled               179.0\n",
       "abelaziz-barrada          185.0\n",
       "abou-diaby                188.0\n",
       "aboulaye-keita            175.0\n",
       "adam-bodzek               184.0\n",
       "adam-campbell             168.0\n",
       "adam-federici             188.0\n",
       "adam-hlousek              188.0\n",
       "adam-johnson              175.0\n",
       "adam-lallana              173.0\n",
       "adam-le-fondre            180.0\n",
       "adam-morgan               179.0\n",
       "adam-pinter               190.0\n",
       "                          ...  \n",
       "yaya-toure                191.0\n",
       "yoan-gouffran             175.0\n",
       "yoann-gourcuff            185.0\n",
       "yohan-cabaye              175.0\n",
       "yohan-mollo               175.0\n",
       "yohandry-orozco           164.0\n",
       "yohann-poulard            190.0\n",
       "yohann-thuram-ulien       187.0\n",
       "yossi-benayoun            178.0\n",
       "younes-belhanda           174.0\n",
       "younes-kaboul             190.0\n",
       "youssef-adnane            178.0\n",
       "youssef-el-arabi          180.0\n",
       "youssuf-mulumbu           177.0\n",
       "yunus-malli               179.0\n",
       "zakarie-labidi            178.0\n",
       "zdenk-pospch              174.0\n",
       "zdravko-kuzmanovic        186.0\n",
       "ze-castro                 183.0\n",
       "zhi-gin-lam               175.0\n",
       "ziri-hammar               179.0\n",
       "zlatan-alomerovic         187.0\n",
       "zlatan-ibrahimovic        192.0\n",
       "zlatko-junuzovic          172.0\n",
       "zoltan-gera               181.0\n",
       "zoltan-stieber            175.0\n",
       "zouheir-dhaouadi          180.0\n",
       "zoumana-camara            182.0\n",
       "zubikarai                 185.0\n",
       "zurutuza                  186.0\n",
       "Name: height, Length: 2053, dtype: float64"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.groupby('playerShort').height.mean()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "181.74372848007872"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.mean(df.groupby('playerShort').height.mean())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Tidy Data\n",
    "\n",
    "Hadley Wickham's concept of a **tidy dataset** can be summarized as:\n",
    "\n",
    ">  - Each variable forms a column\n",
    ">  - Each observation forms a row\n",
    ">  - Each type of observational unit forms a table\n",
    "\n",
    "The full paper describing this concept is available as a [pdf](https://www.jstatsoft.org/article/view/v059i10/v59i10.pdf).\n",
    "\n",
    "Datasets in this form are much simpler to analyze, so the first task is to clean the data into tidy form.\n",
    "\n",
    "I will start by breaking the dataset up into its observational units, with a separate table (or dataframe) for each of:\n",
    "\n",
    " - players\n",
    " - clubs\n",
    " - referees\n",
    " - countries\n",
    " - dyads"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create Tidy Players Table"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>key1</th>\n",
       "      <th>key2</th>\n",
       "      <th>data1</th>\n",
       "      <th>data2</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>a</td>\n",
       "      <td>one</td>\n",
       "      <td>-0.071263</td>\n",
       "      <td>-0.734116</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>a</td>\n",
       "      <td>two</td>\n",
       "      <td>0.882403</td>\n",
       "      <td>-1.598095</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>b</td>\n",
       "      <td>one</td>\n",
       "      <td>-0.031270</td>\n",
       "      <td>-1.497234</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>b</td>\n",
       "      <td>two</td>\n",
       "      <td>-1.118872</td>\n",
       "      <td>-0.206794</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>a</td>\n",
       "      <td>one</td>\n",
       "      <td>2.449043</td>\n",
       "      <td>1.059959</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  key1 key2     data1     data2\n",
       "0    a  one -0.071263 -0.734116\n",
       "1    a  two  0.882403 -1.598095\n",
       "2    b  one -0.031270 -1.497234\n",
       "3    b  two -1.118872 -0.206794\n",
       "4    a  one  2.449043  1.059959"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Small toy DataFrame used to illustrate groupby before applying it to the real data\n",
    "df2 = pd.DataFrame({'key1':['a', 'a', 'b', 'b', 'a'],\n",
    "     'key2':['one', 'two', 'one', 'two', 'one'],\n",
    "     'data1':np.random.randn(5),\n",
    "     'data2':np.random.randn(5)})\n",
    "df2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Group data1 by the key1 column of the same frame (df2, not df)\n",
    "grouped = df2['data1'].groupby(df2['key1'])\n",
    "grouped.mean()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [],
   "source": [
    "player_index = 'playerShort'\n",
    "player_cols = [#'player', # drop player name, we have unique identifier\n",
    "               'birthday',\n",
    "               'height',\n",
    "               'weight',\n",
    "               'position',\n",
    "               'photoID',\n",
    "               'rater1',\n",
    "               'rater2',\n",
    "              ]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>playerShort</th>\n",
       "      <th>player</th>\n",
       "      <th>club</th>\n",
       "      <th>leagueCountry</th>\n",
       "      <th>birthday</th>\n",
       "      <th>height</th>\n",
       "      <th>weight</th>\n",
       "      <th>position</th>\n",
       "      <th>games</th>\n",
       "      <th>victories</th>\n",
       "      <th>...</th>\n",
       "      <th>rater2</th>\n",
       "      <th>refNum</th>\n",
       "      <th>refCountry</th>\n",
       "      <th>Alpha_3</th>\n",
       "      <th>meanIAT</th>\n",
       "      <th>nIAT</th>\n",
       "      <th>seIAT</th>\n",
       "      <th>meanExp</th>\n",
       "      <th>nExp</th>\n",
       "      <th>seExp</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>lucas-wilchez</td>\n",
       "      <td>Lucas Wilchez</td>\n",
       "      <td>Real Zaragoza</td>\n",
       "      <td>Spain</td>\n",
       "      <td>31.08.1983</td>\n",
       "      <td>177.0</td>\n",
       "      <td>72.0</td>\n",
       "      <td>Attacking Midfielder</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.50</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>GRC</td>\n",
       "      <td>0.326391</td>\n",
       "      <td>712.0</td>\n",
       "      <td>0.000564</td>\n",
       "      <td>0.396000</td>\n",
       "      <td>750.0</td>\n",
       "      <td>0.002696</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>john-utaka</td>\n",
       "      <td>John Utaka</td>\n",
       "      <td>Montpellier HSC</td>\n",
       "      <td>France</td>\n",
       "      <td>08.01.1982</td>\n",
       "      <td>179.0</td>\n",
       "      <td>82.0</td>\n",
       "      <td>Right Winger</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.75</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>ZMB</td>\n",
       "      <td>0.203375</td>\n",
       "      <td>40.0</td>\n",
       "      <td>0.010875</td>\n",
       "      <td>-0.204082</td>\n",
       "      <td>49.0</td>\n",
       "      <td>0.061504</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>abdon-prats</td>\n",
       "      <td>Abdón Prats</td>\n",
       "      <td>RCD Mallorca</td>\n",
       "      <td>Spain</td>\n",
       "      <td>17.12.1992</td>\n",
       "      <td>181.0</td>\n",
       "      <td>79.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>ESP</td>\n",
       "      <td>0.369894</td>\n",
       "      <td>1785.0</td>\n",
       "      <td>0.000229</td>\n",
       "      <td>0.588297</td>\n",
       "      <td>1897.0</td>\n",
       "      <td>0.001002</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>pablo-mari</td>\n",
       "      <td>Pablo Marí</td>\n",
       "      <td>RCD Mallorca</td>\n",
       "      <td>Spain</td>\n",
       "      <td>31.08.1993</td>\n",
       "      <td>191.0</td>\n",
       "      <td>87.0</td>\n",
       "      <td>Center Back</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>ESP</td>\n",
       "      <td>0.369894</td>\n",
       "      <td>1785.0</td>\n",
       "      <td>0.000229</td>\n",
       "      <td>0.588297</td>\n",
       "      <td>1897.0</td>\n",
       "      <td>0.001002</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>ruben-pena</td>\n",
       "      <td>Rubén Peña</td>\n",
       "      <td>Real Valladolid</td>\n",
       "      <td>Spain</td>\n",
       "      <td>18.07.1991</td>\n",
       "      <td>172.0</td>\n",
       "      <td>70.0</td>\n",
       "      <td>Right Midfielder</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>ESP</td>\n",
       "      <td>0.369894</td>\n",
       "      <td>1785.0</td>\n",
       "      <td>0.000229</td>\n",
       "      <td>0.588297</td>\n",
       "      <td>1897.0</td>\n",
       "      <td>0.001002</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>aaron-hughes</td>\n",
       "      <td>Aaron Hughes</td>\n",
       "      <td>Fulham FC</td>\n",
       "      <td>England</td>\n",
       "      <td>08.11.1979</td>\n",
       "      <td>182.0</td>\n",
       "      <td>71.0</td>\n",
       "      <td>Center Back</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.00</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>LUX</td>\n",
       "      <td>0.325185</td>\n",
       "      <td>127.0</td>\n",
       "      <td>0.003297</td>\n",
       "      <td>0.538462</td>\n",
       "      <td>130.0</td>\n",
       "      <td>0.013752</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>aleksandar-kolarov</td>\n",
       "      <td>Aleksandar Kolarov</td>\n",
       "      <td>Manchester City</td>\n",
       "      <td>England</td>\n",
       "      <td>10.11.1985</td>\n",
       "      <td>187.0</td>\n",
       "      <td>80.0</td>\n",
       "      <td>Left Fullback</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>...</td>\n",
       "      <td>0.25</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>LUX</td>\n",
       "      <td>0.325185</td>\n",
       "      <td>127.0</td>\n",
       "      <td>0.003297</td>\n",
       "      <td>0.538462</td>\n",
       "      <td>130.0</td>\n",
       "      <td>0.013752</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>alexander-tettey</td>\n",
       "      <td>Alexander Tettey</td>\n",
       "      <td>Norwich City</td>\n",
       "      <td>England</td>\n",
       "      <td>04.04.1986</td>\n",
       "      <td>180.0</td>\n",
       "      <td>68.0</td>\n",
       "      <td>Defensive Midfielder</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>1.00</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>LUX</td>\n",
       "      <td>0.325185</td>\n",
       "      <td>127.0</td>\n",
       "      <td>0.003297</td>\n",
       "      <td>0.538462</td>\n",
       "      <td>130.0</td>\n",
       "      <td>0.013752</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>anders-lindegaard</td>\n",
       "      <td>Anders Lindegaard</td>\n",
       "      <td>Manchester United</td>\n",
       "      <td>England</td>\n",
       "      <td>13.04.1984</td>\n",
       "      <td>193.0</td>\n",
       "      <td>80.0</td>\n",
       "      <td>Goalkeeper</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.25</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>LUX</td>\n",
       "      <td>0.325185</td>\n",
       "      <td>127.0</td>\n",
       "      <td>0.003297</td>\n",
       "      <td>0.538462</td>\n",
       "      <td>130.0</td>\n",
       "      <td>0.013752</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>andreas-beck</td>\n",
       "      <td>Andreas Beck</td>\n",
       "      <td>1899 Hoffenheim</td>\n",
       "      <td>Germany</td>\n",
       "      <td>13.03.1987</td>\n",
       "      <td>180.0</td>\n",
       "      <td>70.0</td>\n",
       "      <td>Right Fullback</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>...</td>\n",
       "      <td>0.00</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>LUX</td>\n",
       "      <td>0.325185</td>\n",
       "      <td>127.0</td>\n",
       "      <td>0.003297</td>\n",
       "      <td>0.538462</td>\n",
       "      <td>130.0</td>\n",
       "      <td>0.013752</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>antonio-rukavina</td>\n",
       "      <td>Antonio Rukavina</td>\n",
       "      <td>Real Valladolid</td>\n",
       "      <td>Spain</td>\n",
       "      <td>26.01.1984</td>\n",
       "      <td>177.0</td>\n",
       "      <td>74.0</td>\n",
       "      <td>Right Fullback</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>...</td>\n",
       "      <td>0.00</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>LUX</td>\n",
       "      <td>0.325185</td>\n",
       "      <td>127.0</td>\n",
       "      <td>0.003297</td>\n",
       "      <td>0.538462</td>\n",
       "      <td>130.0</td>\n",
       "      <td>0.013752</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>ashkan-dejagah</td>\n",
       "      <td>Ashkan Dejagah</td>\n",
       "      <td>Fulham FC</td>\n",
       "      <td>England</td>\n",
       "      <td>05.07.1986</td>\n",
       "      <td>181.0</td>\n",
       "      <td>74.0</td>\n",
       "      <td>Left Winger</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>...</td>\n",
       "      <td>0.50</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>LUX</td>\n",
       "      <td>0.325185</td>\n",
       "      <td>127.0</td>\n",
       "      <td>0.003297</td>\n",
       "      <td>0.538462</td>\n",
       "      <td>130.0</td>\n",
       "      <td>0.013752</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>benedikt-hoewedes</td>\n",
       "      <td>Benedikt Höwedes</td>\n",
       "      <td>FC Schalke 04</td>\n",
       "      <td>Germany</td>\n",
       "      <td>29.02.1988</td>\n",
       "      <td>187.0</td>\n",
       "      <td>80.0</td>\n",
       "      <td>Center Back</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>...</td>\n",
       "      <td>0.00</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>LUX</td>\n",
       "      <td>0.325185</td>\n",
       "      <td>127.0</td>\n",
       "      <td>0.003297</td>\n",
       "      <td>0.538462</td>\n",
       "      <td>130.0</td>\n",
       "      <td>0.013752</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>chris-baird</td>\n",
       "      <td>Chris Baird</td>\n",
       "      <td>Fulham FC</td>\n",
       "      <td>England</td>\n",
       "      <td>25.02.1982</td>\n",
       "      <td>186.0</td>\n",
       "      <td>77.0</td>\n",
       "      <td>Defensive Midfielder</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.00</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>LUX</td>\n",
       "      <td>0.325185</td>\n",
       "      <td>127.0</td>\n",
       "      <td>0.003297</td>\n",
       "      <td>0.538462</td>\n",
       "      <td>130.0</td>\n",
       "      <td>0.013752</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>chris-brunt</td>\n",
       "      <td>Chris Brunt</td>\n",
       "      <td>West Bromwich Albion</td>\n",
       "      <td>England</td>\n",
       "      <td>14.12.1984</td>\n",
       "      <td>185.0</td>\n",
       "      <td>74.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.25</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>LUX</td>\n",
       "      <td>0.325185</td>\n",
       "      <td>127.0</td>\n",
       "      <td>0.003297</td>\n",
       "      <td>0.538462</td>\n",
       "      <td>130.0</td>\n",
       "      <td>0.013752</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>daniel-schwaab</td>\n",
       "      <td>Daniel Schwaab</td>\n",
       "      <td>Bayer Leverkusen</td>\n",
       "      <td>Germany</td>\n",
       "      <td>23.08.1988</td>\n",
       "      <td>186.0</td>\n",
       "      <td>76.0</td>\n",
       "      <td>Right Fullback</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>...</td>\n",
       "      <td>0.00</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>LUX</td>\n",
       "      <td>0.325185</td>\n",
       "      <td>127.0</td>\n",
       "      <td>0.003297</td>\n",
       "      <td>0.538462</td>\n",
       "      <td>130.0</td>\n",
       "      <td>0.013752</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>dennis-aogo</td>\n",
       "      <td>Dennis Aogo</td>\n",
       "      <td>Hamburger SV</td>\n",
       "      <td>Germany</td>\n",
       "      <td>14.01.1987</td>\n",
       "      <td>184.0</td>\n",
       "      <td>85.0</td>\n",
       "      <td>Left Fullback</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>...</td>\n",
       "      <td>0.50</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>LUX</td>\n",
       "      <td>0.325185</td>\n",
       "      <td>127.0</td>\n",
       "      <td>0.003297</td>\n",
       "      <td>0.538462</td>\n",
       "      <td>130.0</td>\n",
       "      <td>0.013752</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>george-mccartney</td>\n",
       "      <td>George McCartney</td>\n",
       "      <td>West Ham United</td>\n",
       "      <td>England</td>\n",
       "      <td>29.04.1981</td>\n",
       "      <td>180.0</td>\n",
       "      <td>74.0</td>\n",
       "      <td>Left Fullback</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.00</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>LUX</td>\n",
       "      <td>0.325185</td>\n",
       "      <td>127.0</td>\n",
       "      <td>0.003297</td>\n",
       "      <td>0.538462</td>\n",
       "      <td>130.0</td>\n",
       "      <td>0.013752</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>gylfi-sigurdsson</td>\n",
       "      <td>Gylfi Sigurðsson</td>\n",
       "      <td>Tottenham Hotspur</td>\n",
       "      <td>England</td>\n",
       "      <td>08.09.1989</td>\n",
       "      <td>186.0</td>\n",
       "      <td>77.0</td>\n",
       "      <td>Attacking Midfielder</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.00</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>LUX</td>\n",
       "      <td>0.325185</td>\n",
       "      <td>127.0</td>\n",
       "      <td>0.003297</td>\n",
       "      <td>0.538462</td>\n",
       "      <td>130.0</td>\n",
       "      <td>0.013752</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>ivan-obradovic</td>\n",
       "      <td>Ivan Obradović</td>\n",
       "      <td>Real Zaragoza</td>\n",
       "      <td>Spain</td>\n",
       "      <td>25.07.1988</td>\n",
       "      <td>181.0</td>\n",
       "      <td>74.0</td>\n",
       "      <td>Left Fullback</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>...</td>\n",
       "      <td>0.25</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>LUX</td>\n",
       "      <td>0.325185</td>\n",
       "      <td>127.0</td>\n",
       "      <td>0.003297</td>\n",
       "      <td>0.538462</td>\n",
       "      <td>130.0</td>\n",
       "      <td>0.013752</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>jan-moravek</td>\n",
       "      <td>Jan Morávek</td>\n",
       "      <td>FC Augsburg</td>\n",
       "      <td>Germany</td>\n",
       "      <td>01.11.1989</td>\n",
       "      <td>180.0</td>\n",
       "      <td>75.0</td>\n",
       "      <td>Attacking Midfielder</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.25</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>LUX</td>\n",
       "      <td>0.325185</td>\n",
       "      <td>127.0</td>\n",
       "      <td>0.003297</td>\n",
       "      <td>0.538462</td>\n",
       "      <td>130.0</td>\n",
       "      <td>0.013752</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>jan-rosenthal</td>\n",
       "      <td>Jan Rosenthal</td>\n",
       "      <td>SC Freiburg</td>\n",
       "      <td>Germany</td>\n",
       "      <td>07.04.1986</td>\n",
       "      <td>186.0</td>\n",
       "      <td>76.0</td>\n",
       "      <td>Attacking Midfielder</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>...</td>\n",
       "      <td>0.00</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>LUX</td>\n",
       "      <td>0.325185</td>\n",
       "      <td>127.0</td>\n",
       "      <td>0.003297</td>\n",
       "      <td>0.538462</td>\n",
       "      <td>130.0</td>\n",
       "      <td>0.013752</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>22</th>\n",
       "      <td>jonny-evans</td>\n",
       "      <td>Jonny Evans</td>\n",
       "      <td>Manchester United</td>\n",
       "      <td>England</td>\n",
       "      <td>02.01.1988</td>\n",
       "      <td>188.0</td>\n",
       "      <td>77.0</td>\n",
       "      <td>Center Back</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.00</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>LUX</td>\n",
       "      <td>0.325185</td>\n",
       "      <td>127.0</td>\n",
       "      <td>0.003297</td>\n",
       "      <td>0.538462</td>\n",
       "      <td>130.0</td>\n",
       "      <td>0.013752</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>kyriakos-papadopoulos</td>\n",
       "      <td>Kyriakos Papadopoulos</td>\n",
       "      <td>FC Schalke 04</td>\n",
       "      <td>Germany</td>\n",
       "      <td>23.02.1992</td>\n",
       "      <td>183.0</td>\n",
       "      <td>85.0</td>\n",
       "      <td>Center Back</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.00</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>LUX</td>\n",
       "      <td>0.325185</td>\n",
       "      <td>127.0</td>\n",
       "      <td>0.003297</td>\n",
       "      <td>0.538462</td>\n",
       "      <td>130.0</td>\n",
       "      <td>0.013752</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>24</th>\n",
       "      <td>marko-marin</td>\n",
       "      <td>Marko Marin</td>\n",
       "      <td>Chelsea FC</td>\n",
       "      <td>England</td>\n",
       "      <td>13.03.1989</td>\n",
       "      <td>170.0</td>\n",
       "      <td>63.0</td>\n",
       "      <td>Attacking Midfielder</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>...</td>\n",
       "      <td>0.00</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>LUX</td>\n",
       "      <td>0.325185</td>\n",
       "      <td>127.0</td>\n",
       "      <td>0.003297</td>\n",
       "      <td>0.538462</td>\n",
       "      <td>130.0</td>\n",
       "      <td>0.013752</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25</th>\n",
       "      <td>mats-hummels</td>\n",
       "      <td>Mats Hummels</td>\n",
       "      <td>Borussia Dortmund</td>\n",
       "      <td>Germany</td>\n",
       "      <td>16.12.1988</td>\n",
       "      <td>192.0</td>\n",
       "      <td>90.0</td>\n",
       "      <td>Center Back</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>...</td>\n",
       "      <td>0.25</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>LUX</td>\n",
       "      <td>0.325185</td>\n",
       "      <td>127.0</td>\n",
       "      <td>0.003297</td>\n",
       "      <td>0.538462</td>\n",
       "      <td>130.0</td>\n",
       "      <td>0.013752</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>26</th>\n",
       "      <td>mesut-oezil</td>\n",
       "      <td>Mesut Özil</td>\n",
       "      <td>Real Madrid</td>\n",
       "      <td>Spain</td>\n",
       "      <td>15.10.1988</td>\n",
       "      <td>183.0</td>\n",
       "      <td>76.0</td>\n",
       "      <td>Attacking Midfielder</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>...</td>\n",
       "      <td>0.25</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>LUX</td>\n",
       "      <td>0.325185</td>\n",
       "      <td>127.0</td>\n",
       "      <td>0.003297</td>\n",
       "      <td>0.538462</td>\n",
       "      <td>130.0</td>\n",
       "      <td>0.013752</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>27</th>\n",
       "      <td>milorad-pekovic</td>\n",
       "      <td>Milorad Peković</td>\n",
       "      <td>SpVgg Greuther Fürth</td>\n",
       "      <td>Germany</td>\n",
       "      <td>05.08.1977</td>\n",
       "      <td>189.0</td>\n",
       "      <td>88.0</td>\n",
       "      <td>Defensive Midfielder</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>...</td>\n",
       "      <td>0.25</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>LUX</td>\n",
       "      <td>0.325185</td>\n",
       "      <td>127.0</td>\n",
       "      <td>0.003297</td>\n",
       "      <td>0.538462</td>\n",
       "      <td>130.0</td>\n",
       "      <td>0.013752</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>28</th>\n",
       "      <td>nemanja-vidic</td>\n",
       "      <td>Nemanja Vidić</td>\n",
       "      <td>Manchester United</td>\n",
       "      <td>England</td>\n",
       "      <td>21.10.1981</td>\n",
       "      <td>188.0</td>\n",
       "      <td>82.0</td>\n",
       "      <td>Center Back</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>...</td>\n",
       "      <td>0.00</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>LUX</td>\n",
       "      <td>0.325185</td>\n",
       "      <td>127.0</td>\n",
       "      <td>0.003297</td>\n",
       "      <td>0.538462</td>\n",
       "      <td>130.0</td>\n",
       "      <td>0.013752</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>29</th>\n",
       "      <td>neven-subotic</td>\n",
       "      <td>Neven Subotić</td>\n",
       "      <td>Borussia Dortmund</td>\n",
       "      <td>Germany</td>\n",
       "      <td>10.12.1988</td>\n",
       "      <td>193.0</td>\n",
       "      <td>88.0</td>\n",
       "      <td>Center Back</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>...</td>\n",
       "      <td>0.25</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>LUX</td>\n",
       "      <td>0.325185</td>\n",
       "      <td>127.0</td>\n",
       "      <td>0.003297</td>\n",
       "      <td>0.538462</td>\n",
       "      <td>130.0</td>\n",
       "      <td>0.013752</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>117353</th>\n",
       "      <td>dylan-tombides</td>\n",
       "      <td>Dylan Tombides</td>\n",
       "      <td>West Ham United</td>\n",
       "      <td>England</td>\n",
       "      <td>08.03.1994</td>\n",
       "      <td>180.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Center Forward</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2481</td>\n",
       "      <td>100</td>\n",
       "      <td>JAM</td>\n",
       "      <td>0.109661</td>\n",
       "      <td>517.0</td>\n",
       "      <td>0.000881</td>\n",
       "      <td>-1.076350</td>\n",
       "      <td>537.0</td>\n",
       "      <td>0.004087</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>118341</th>\n",
       "      <td>dominic-samuel</td>\n",
       "      <td>Dominic Samuel</td>\n",
       "      <td>Reading FC</td>\n",
       "      <td>England</td>\n",
       "      <td>01.04.1994</td>\n",
       "      <td>182.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2517</td>\n",
       "      <td>44</td>\n",
       "      <td>ENGL</td>\n",
       "      <td>0.326690</td>\n",
       "      <td>44791.0</td>\n",
       "      <td>0.000010</td>\n",
       "      <td>0.356446</td>\n",
       "      <td>46916.0</td>\n",
       "      <td>0.000037</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>120273</th>\n",
       "      <td>jonathan-santiago</td>\n",
       "      <td>Jonathan Santiago</td>\n",
       "      <td>Olympique Marseille</td>\n",
       "      <td>France</td>\n",
       "      <td>09.06.1994</td>\n",
       "      <td>167.0</td>\n",
       "      <td>59.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2546</td>\n",
       "      <td>57</td>\n",
       "      <td>AUT</td>\n",
       "      <td>0.337539</td>\n",
       "      <td>1319.0</td>\n",
       "      <td>0.000331</td>\n",
       "      <td>0.394139</td>\n",
       "      <td>1365.0</td>\n",
       "      <td>0.001717</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>120280</th>\n",
       "      <td>kevin-osei</td>\n",
       "      <td>Kevin Osei</td>\n",
       "      <td>Olympique Marseille</td>\n",
       "      <td>France</td>\n",
       "      <td>26.03.1991</td>\n",
       "      <td>173.0</td>\n",
       "      <td>71.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.75</td>\n",
       "      <td>2546</td>\n",
       "      <td>57</td>\n",
       "      <td>AUT</td>\n",
       "      <td>0.337539</td>\n",
       "      <td>1319.0</td>\n",
       "      <td>0.000331</td>\n",
       "      <td>0.394139</td>\n",
       "      <td>1365.0</td>\n",
       "      <td>0.001717</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>120377</th>\n",
       "      <td>wesley-jobello</td>\n",
       "      <td>Wesley Jobello</td>\n",
       "      <td>Olympique Marseille</td>\n",
       "      <td>France</td>\n",
       "      <td>23.01.1994</td>\n",
       "      <td>179.0</td>\n",
       "      <td>68.0</td>\n",
       "      <td>Left Winger</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.75</td>\n",
       "      <td>2546</td>\n",
       "      <td>57</td>\n",
       "      <td>AUT</td>\n",
       "      <td>0.337539</td>\n",
       "      <td>1319.0</td>\n",
       "      <td>0.000331</td>\n",
       "      <td>0.394139</td>\n",
       "      <td>1365.0</td>\n",
       "      <td>0.001717</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>121494</th>\n",
       "      <td>stephen-sama</td>\n",
       "      <td>Stephen Sama</td>\n",
       "      <td>Liverpool FC</td>\n",
       "      <td>England</td>\n",
       "      <td>05.03.1993</td>\n",
       "      <td>188.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2572</td>\n",
       "      <td>64</td>\n",
       "      <td>NLD</td>\n",
       "      <td>0.352920</td>\n",
       "      <td>5952.0</td>\n",
       "      <td>0.000070</td>\n",
       "      <td>0.445679</td>\n",
       "      <td>6121.0</td>\n",
       "      <td>0.000269</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>121798</th>\n",
       "      <td>adrien-thomasson</td>\n",
       "      <td>Adrien Thomasson</td>\n",
       "      <td>Évian Thonon Gaillard</td>\n",
       "      <td>France</td>\n",
       "      <td>10.12.1993</td>\n",
       "      <td>182.0</td>\n",
       "      <td>75.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2598</td>\n",
       "      <td>7</td>\n",
       "      <td>FRA</td>\n",
       "      <td>0.334684</td>\n",
       "      <td>2882.0</td>\n",
       "      <td>0.000151</td>\n",
       "      <td>0.336101</td>\n",
       "      <td>3011.0</td>\n",
       "      <td>0.000586</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>121911</th>\n",
       "      <td>diacko-fofana</td>\n",
       "      <td>Diacko Fofana</td>\n",
       "      <td>OGC Nice</td>\n",
       "      <td>France</td>\n",
       "      <td>29.07.1994</td>\n",
       "      <td>175.0</td>\n",
       "      <td>71.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2598</td>\n",
       "      <td>7</td>\n",
       "      <td>FRA</td>\n",
       "      <td>0.334684</td>\n",
       "      <td>2882.0</td>\n",
       "      <td>0.000151</td>\n",
       "      <td>0.336101</td>\n",
       "      <td>3011.0</td>\n",
       "      <td>0.000586</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>121939</th>\n",
       "      <td>fabien-dao-castellana</td>\n",
       "      <td>Fabien Dao Castellana</td>\n",
       "      <td>OGC Nice</td>\n",
       "      <td>France</td>\n",
       "      <td>28.07.1993</td>\n",
       "      <td>169.0</td>\n",
       "      <td>61.0</td>\n",
       "      <td>Defensive Midfielder</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2598</td>\n",
       "      <td>7</td>\n",
       "      <td>FRA</td>\n",
       "      <td>0.334684</td>\n",
       "      <td>2882.0</td>\n",
       "      <td>0.000151</td>\n",
       "      <td>0.336101</td>\n",
       "      <td>3011.0</td>\n",
       "      <td>0.000586</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>122216</th>\n",
       "      <td>petrus-boumal</td>\n",
       "      <td>Petrus Boumal</td>\n",
       "      <td>FC Sochaux</td>\n",
       "      <td>France</td>\n",
       "      <td>20.04.1993</td>\n",
       "      <td>175.0</td>\n",
       "      <td>75.0</td>\n",
       "      <td>Center Midfielder</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2598</td>\n",
       "      <td>7</td>\n",
       "      <td>FRA</td>\n",
       "      <td>0.334684</td>\n",
       "      <td>2882.0</td>\n",
       "      <td>0.000151</td>\n",
       "      <td>0.336101</td>\n",
       "      <td>3011.0</td>\n",
       "      <td>0.000586</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>122915</th>\n",
       "      <td>cyril-hennion</td>\n",
       "      <td>Cyril Hennion</td>\n",
       "      <td>OGC Nice</td>\n",
       "      <td>France</td>\n",
       "      <td>03.01.1992</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2619</td>\n",
       "      <td>7</td>\n",
       "      <td>FRA</td>\n",
       "      <td>0.334684</td>\n",
       "      <td>2882.0</td>\n",
       "      <td>0.000151</td>\n",
       "      <td>0.336101</td>\n",
       "      <td>3011.0</td>\n",
       "      <td>0.000586</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>123250</th>\n",
       "      <td>pape-coulibaly</td>\n",
       "      <td>Pape Coulibaly</td>\n",
       "      <td>AS Saint-Étienne</td>\n",
       "      <td>France</td>\n",
       "      <td>02.03.1988</td>\n",
       "      <td>194.0</td>\n",
       "      <td>92.0</td>\n",
       "      <td>Goalkeeper</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2619</td>\n",
       "      <td>7</td>\n",
       "      <td>FRA</td>\n",
       "      <td>0.334684</td>\n",
       "      <td>2882.0</td>\n",
       "      <td>0.000151</td>\n",
       "      <td>0.336101</td>\n",
       "      <td>3011.0</td>\n",
       "      <td>0.000586</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>125070</th>\n",
       "      <td>adrien-thomasson</td>\n",
       "      <td>Adrien Thomasson</td>\n",
       "      <td>Évian Thonon Gaillard</td>\n",
       "      <td>France</td>\n",
       "      <td>10.12.1993</td>\n",
       "      <td>182.0</td>\n",
       "      <td>75.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2672</td>\n",
       "      <td>7</td>\n",
       "      <td>FRA</td>\n",
       "      <td>0.334684</td>\n",
       "      <td>2882.0</td>\n",
       "      <td>0.000151</td>\n",
       "      <td>0.336101</td>\n",
       "      <td>3011.0</td>\n",
       "      <td>0.000586</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>125306</th>\n",
       "      <td>mickael-charvet</td>\n",
       "      <td>Mickaël Charvet</td>\n",
       "      <td>AC Ajaccio</td>\n",
       "      <td>France</td>\n",
       "      <td>31.03.1988</td>\n",
       "      <td>179.0</td>\n",
       "      <td>77.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2672</td>\n",
       "      <td>7</td>\n",
       "      <td>FRA</td>\n",
       "      <td>0.334684</td>\n",
       "      <td>2882.0</td>\n",
       "      <td>0.000151</td>\n",
       "      <td>0.336101</td>\n",
       "      <td>3011.0</td>\n",
       "      <td>0.000586</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>125413</th>\n",
       "      <td>yannick-aguemon</td>\n",
       "      <td>Yannick Aguemon</td>\n",
       "      <td>Toulouse FC</td>\n",
       "      <td>France</td>\n",
       "      <td>11.02.1992</td>\n",
       "      <td>180.0</td>\n",
       "      <td>74.0</td>\n",
       "      <td>Right Winger</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2672</td>\n",
       "      <td>7</td>\n",
       "      <td>FRA</td>\n",
       "      <td>0.334684</td>\n",
       "      <td>2882.0</td>\n",
       "      <td>0.000151</td>\n",
       "      <td>0.336101</td>\n",
       "      <td>3011.0</td>\n",
       "      <td>0.000586</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>125519</th>\n",
       "      <td>cyril-hennion</td>\n",
       "      <td>Cyril Hennion</td>\n",
       "      <td>OGC Nice</td>\n",
       "      <td>France</td>\n",
       "      <td>03.01.1992</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2673</td>\n",
       "      <td>7</td>\n",
       "      <td>FRA</td>\n",
       "      <td>0.334684</td>\n",
       "      <td>2882.0</td>\n",
       "      <td>0.000151</td>\n",
       "      <td>0.336101</td>\n",
       "      <td>3011.0</td>\n",
       "      <td>0.000586</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>129412</th>\n",
       "      <td>dylan-tombides</td>\n",
       "      <td>Dylan Tombides</td>\n",
       "      <td>West Ham United</td>\n",
       "      <td>England</td>\n",
       "      <td>08.03.1994</td>\n",
       "      <td>180.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Center Forward</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2791</td>\n",
       "      <td>32</td>\n",
       "      <td>CHE</td>\n",
       "      <td>0.345305</td>\n",
       "      <td>1886.0</td>\n",
       "      <td>0.000219</td>\n",
       "      <td>0.377193</td>\n",
       "      <td>1938.0</td>\n",
       "      <td>0.000823</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>130405</th>\n",
       "      <td>adrien-thomasson</td>\n",
       "      <td>Adrien Thomasson</td>\n",
       "      <td>Évian Thonon Gaillard</td>\n",
       "      <td>France</td>\n",
       "      <td>10.12.1993</td>\n",
       "      <td>182.0</td>\n",
       "      <td>75.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2796</td>\n",
       "      <td>7</td>\n",
       "      <td>FRA</td>\n",
       "      <td>0.334684</td>\n",
       "      <td>2882.0</td>\n",
       "      <td>0.000151</td>\n",
       "      <td>0.336101</td>\n",
       "      <td>3011.0</td>\n",
       "      <td>0.000586</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>131014</th>\n",
       "      <td>quentin-pereira</td>\n",
       "      <td>Quentin Pereira</td>\n",
       "      <td>Stade Reims</td>\n",
       "      <td>France</td>\n",
       "      <td>21.04.1992</td>\n",
       "      <td>174.0</td>\n",
       "      <td>66.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2796</td>\n",
       "      <td>7</td>\n",
       "      <td>FRA</td>\n",
       "      <td>0.334684</td>\n",
       "      <td>2882.0</td>\n",
       "      <td>0.000151</td>\n",
       "      <td>0.336101</td>\n",
       "      <td>3011.0</td>\n",
       "      <td>0.000586</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>131197</th>\n",
       "      <td>yannick-aguemon</td>\n",
       "      <td>Yannick Aguemon</td>\n",
       "      <td>Toulouse FC</td>\n",
       "      <td>France</td>\n",
       "      <td>11.02.1992</td>\n",
       "      <td>180.0</td>\n",
       "      <td>74.0</td>\n",
       "      <td>Right Winger</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2796</td>\n",
       "      <td>7</td>\n",
       "      <td>FRA</td>\n",
       "      <td>0.334684</td>\n",
       "      <td>2882.0</td>\n",
       "      <td>0.000151</td>\n",
       "      <td>0.336101</td>\n",
       "      <td>3011.0</td>\n",
       "      <td>0.000586</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>139030</th>\n",
       "      <td>senah-mango</td>\n",
       "      <td>Sénah Mango</td>\n",
       "      <td>Olympique Marseille</td>\n",
       "      <td>France</td>\n",
       "      <td>13.12.1989</td>\n",
       "      <td>181.0</td>\n",
       "      <td>78.0</td>\n",
       "      <td>Center Back</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>1.00</td>\n",
       "      <td>2951</td>\n",
       "      <td>22</td>\n",
       "      <td>POL</td>\n",
       "      <td>0.369958</td>\n",
       "      <td>1021.0</td>\n",
       "      <td>0.000412</td>\n",
       "      <td>0.831268</td>\n",
       "      <td>1049.0</td>\n",
       "      <td>0.002034</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>139048</th>\n",
       "      <td>jeremy-frick</td>\n",
       "      <td>Jérémy Frick</td>\n",
       "      <td>Olympique Lyon</td>\n",
       "      <td>France</td>\n",
       "      <td>08.03.1993</td>\n",
       "      <td>190.0</td>\n",
       "      <td>90.0</td>\n",
       "      <td>Goalkeeper</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2953</td>\n",
       "      <td>32</td>\n",
       "      <td>CHE</td>\n",
       "      <td>0.345305</td>\n",
       "      <td>1886.0</td>\n",
       "      <td>0.000219</td>\n",
       "      <td>0.377193</td>\n",
       "      <td>1938.0</td>\n",
       "      <td>0.000823</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>139150</th>\n",
       "      <td>baptiste-aloe</td>\n",
       "      <td>Baptiste Aloe</td>\n",
       "      <td>Olympique Marseille</td>\n",
       "      <td>France</td>\n",
       "      <td>29.06.1994</td>\n",
       "      <td>184.0</td>\n",
       "      <td>77.0</td>\n",
       "      <td>Center Back</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>...</td>\n",
       "      <td>0.25</td>\n",
       "      <td>2960</td>\n",
       "      <td>80</td>\n",
       "      <td>FIN</td>\n",
       "      <td>0.396512</td>\n",
       "      <td>2273.0</td>\n",
       "      <td>0.000187</td>\n",
       "      <td>1.031407</td>\n",
       "      <td>2388.0</td>\n",
       "      <td>0.000931</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>139769</th>\n",
       "      <td>momar-bangoura</td>\n",
       "      <td>Momar Bangoura</td>\n",
       "      <td>Olympique Marseille</td>\n",
       "      <td>France</td>\n",
       "      <td>24.02.1994</td>\n",
       "      <td>176.0</td>\n",
       "      <td>65.0</td>\n",
       "      <td>Defensive Midfielder</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>1.00</td>\n",
       "      <td>2961</td>\n",
       "      <td>7</td>\n",
       "      <td>FRA</td>\n",
       "      <td>0.334684</td>\n",
       "      <td>2882.0</td>\n",
       "      <td>0.000151</td>\n",
       "      <td>0.336101</td>\n",
       "      <td>3011.0</td>\n",
       "      <td>0.000586</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>139946</th>\n",
       "      <td>wesley-jobello</td>\n",
       "      <td>Wesley Jobello</td>\n",
       "      <td>Olympique Marseille</td>\n",
       "      <td>France</td>\n",
       "      <td>23.01.1994</td>\n",
       "      <td>179.0</td>\n",
       "      <td>68.0</td>\n",
       "      <td>Left Winger</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.75</td>\n",
       "      <td>2961</td>\n",
       "      <td>7</td>\n",
       "      <td>FRA</td>\n",
       "      <td>0.334684</td>\n",
       "      <td>2882.0</td>\n",
       "      <td>0.000151</td>\n",
       "      <td>0.336101</td>\n",
       "      <td>3011.0</td>\n",
       "      <td>0.000586</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>141414</th>\n",
       "      <td>dylan-tombides</td>\n",
       "      <td>Dylan Tombides</td>\n",
       "      <td>West Ham United</td>\n",
       "      <td>England</td>\n",
       "      <td>08.03.1994</td>\n",
       "      <td>180.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Center Forward</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>3015</td>\n",
       "      <td>47</td>\n",
       "      <td>PER</td>\n",
       "      <td>0.398503</td>\n",
       "      <td>254.0</td>\n",
       "      <td>0.001682</td>\n",
       "      <td>0.697417</td>\n",
       "      <td>271.0</td>\n",
       "      <td>0.008707</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>143329</th>\n",
       "      <td>marc-vidal</td>\n",
       "      <td>Marc Vidal</td>\n",
       "      <td>Toulouse FC</td>\n",
       "      <td>France</td>\n",
       "      <td>03.06.1991</td>\n",
       "      <td>187.0</td>\n",
       "      <td>80.0</td>\n",
       "      <td>Goalkeeper</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>3077</td>\n",
       "      <td>7</td>\n",
       "      <td>FRA</td>\n",
       "      <td>0.334684</td>\n",
       "      <td>2882.0</td>\n",
       "      <td>0.000151</td>\n",
       "      <td>0.336101</td>\n",
       "      <td>3011.0</td>\n",
       "      <td>0.000586</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>143408</th>\n",
       "      <td>quentin-pereira</td>\n",
       "      <td>Quentin Pereira</td>\n",
       "      <td>Stade Reims</td>\n",
       "      <td>France</td>\n",
       "      <td>21.04.1992</td>\n",
       "      <td>174.0</td>\n",
       "      <td>66.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>3077</td>\n",
       "      <td>7</td>\n",
       "      <td>FRA</td>\n",
       "      <td>0.334684</td>\n",
       "      <td>2882.0</td>\n",
       "      <td>0.000151</td>\n",
       "      <td>0.336101</td>\n",
       "      <td>3011.0</td>\n",
       "      <td>0.000586</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>143973</th>\n",
       "      <td>johann-durand</td>\n",
       "      <td>Johann Durand</td>\n",
       "      <td>Évian Thonon Gaillard</td>\n",
       "      <td>France</td>\n",
       "      <td>17.06.1981</td>\n",
       "      <td>182.0</td>\n",
       "      <td>71.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>3080</td>\n",
       "      <td>7</td>\n",
       "      <td>FRA</td>\n",
       "      <td>0.334684</td>\n",
       "      <td>2882.0</td>\n",
       "      <td>0.000151</td>\n",
       "      <td>0.336101</td>\n",
       "      <td>3011.0</td>\n",
       "      <td>0.000586</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>144067</th>\n",
       "      <td>alexander-ndoumbou</td>\n",
       "      <td>Alexander N'Doumbou</td>\n",
       "      <td>Olympique Marseille</td>\n",
       "      <td>France</td>\n",
       "      <td>04.01.1992</td>\n",
       "      <td>174.0</td>\n",
       "      <td>70.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.50</td>\n",
       "      <td>3083</td>\n",
       "      <td>30</td>\n",
       "      <td>COL</td>\n",
       "      <td>0.377402</td>\n",
       "      <td>601.0</td>\n",
       "      <td>0.000703</td>\n",
       "      <td>0.588050</td>\n",
       "      <td>636.0</td>\n",
       "      <td>0.003579</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>10149 rows × 28 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                  playerShort                 player                   club  \\\n",
       "0               lucas-wilchez          Lucas Wilchez          Real Zaragoza   \n",
       "1                  john-utaka             John Utaka        Montpellier HSC   \n",
       "2                 abdon-prats            Abdón Prats           RCD Mallorca   \n",
       "3                  pablo-mari             Pablo Marí           RCD Mallorca   \n",
       "4                  ruben-pena             Rubén Peña        Real Valladolid   \n",
       "5                aaron-hughes           Aaron Hughes              Fulham FC   \n",
       "6          aleksandar-kolarov     Aleksandar Kolarov        Manchester City   \n",
       "7            alexander-tettey       Alexander Tettey           Norwich City   \n",
       "8           anders-lindegaard      Anders Lindegaard      Manchester United   \n",
       "9                andreas-beck           Andreas Beck        1899 Hoffenheim   \n",
       "10           antonio-rukavina       Antonio Rukavina        Real Valladolid   \n",
       "11             ashkan-dejagah         Ashkan Dejagah              Fulham FC   \n",
       "12          benedikt-hoewedes       Benedikt Höwedes          FC Schalke 04   \n",
       "13                chris-baird            Chris Baird              Fulham FC   \n",
       "14                chris-brunt            Chris Brunt   West Bromwich Albion   \n",
       "15             daniel-schwaab         Daniel Schwaab       Bayer Leverkusen   \n",
       "16                dennis-aogo            Dennis Aogo           Hamburger SV   \n",
       "17           george-mccartney       George McCartney        West Ham United   \n",
       "18           gylfi-sigurdsson       Gylfi Sigurðsson      Tottenham Hotspur   \n",
       "19             ivan-obradovic         Ivan Obradović          Real Zaragoza   \n",
       "20                jan-moravek            Jan Morávek            FC Augsburg   \n",
       "21              jan-rosenthal          Jan Rosenthal            SC Freiburg   \n",
       "22                jonny-evans            Jonny Evans      Manchester United   \n",
       "23      kyriakos-papadopoulos  Kyriakos Papadopoulos          FC Schalke 04   \n",
       "24                marko-marin            Marko Marin             Chelsea FC   \n",
       "25               mats-hummels           Mats Hummels      Borussia Dortmund   \n",
       "26                mesut-oezil             Mesut Özil            Real Madrid   \n",
       "27            milorad-pekovic        Milorad Peković   SpVgg Greuther Fürth   \n",
       "28              nemanja-vidic          Nemanja Vidić      Manchester United   \n",
       "29              neven-subotic          Neven Subotić      Borussia Dortmund   \n",
       "...                       ...                    ...                    ...   \n",
       "117353         dylan-tombides         Dylan Tombides        West Ham United   \n",
       "118341         dominic-samuel         Dominic Samuel             Reading FC   \n",
       "120273      jonathan-santiago      Jonathan Santiago    Olympique Marseille   \n",
       "120280             kevin-osei             Kevin Osei    Olympique Marseille   \n",
       "120377         wesley-jobello         Wesley Jobello    Olympique Marseille   \n",
       "121494           stephen-sama           Stephen Sama           Liverpool FC   \n",
       "121798       adrien-thomasson       Adrien Thomasson  Évian Thonon Gaillard   \n",
       "121911          diacko-fofana          Diacko Fofana               OGC Nice   \n",
       "121939  fabien-dao-castellana  Fabien Dao Castellana               OGC Nice   \n",
       "122216          petrus-boumal          Petrus Boumal             FC Sochaux   \n",
       "122915          cyril-hennion          Cyril Hennion               OGC Nice   \n",
       "123250         pape-coulibaly         Pape Coulibaly       AS Saint-Étienne   \n",
       "125070       adrien-thomasson       Adrien Thomasson  Évian Thonon Gaillard   \n",
       "125306        mickael-charvet        Mickaël Charvet             AC Ajaccio   \n",
       "125413        yannick-aguemon        Yannick Aguemon            Toulouse FC   \n",
       "125519          cyril-hennion          Cyril Hennion               OGC Nice   \n",
       "129412         dylan-tombides         Dylan Tombides        West Ham United   \n",
       "130405       adrien-thomasson       Adrien Thomasson  Évian Thonon Gaillard   \n",
       "131014        quentin-pereira        Quentin Pereira            Stade Reims   \n",
       "131197        yannick-aguemon        Yannick Aguemon            Toulouse FC   \n",
       "139030            senah-mango            Sénah Mango    Olympique Marseille   \n",
       "139048           jeremy-frick           Jérémy Frick         Olympique Lyon   \n",
       "139150          baptiste-aloe          Baptiste Aloe    Olympique Marseille   \n",
       "139769         momar-bangoura         Momar Bangoura    Olympique Marseille   \n",
       "139946         wesley-jobello         Wesley Jobello    Olympique Marseille   \n",
       "141414         dylan-tombides         Dylan Tombides        West Ham United   \n",
       "143329             marc-vidal             Marc Vidal            Toulouse FC   \n",
       "143408        quentin-pereira        Quentin Pereira            Stade Reims   \n",
       "143973          johann-durand          Johann Durand  Évian Thonon Gaillard   \n",
       "144067     alexander-ndoumbou    Alexander N'Doumbou    Olympique Marseille   \n",
       "\n",
       "       leagueCountry    birthday  height  weight              position  games  \\\n",
       "0              Spain  31.08.1983   177.0    72.0  Attacking Midfielder      1   \n",
       "1             France  08.01.1982   179.0    82.0          Right Winger      1   \n",
       "2              Spain  17.12.1992   181.0    79.0                   NaN      1   \n",
       "3              Spain  31.08.1993   191.0    87.0           Center Back      1   \n",
       "4              Spain  18.07.1991   172.0    70.0      Right Midfielder      1   \n",
       "5            England  08.11.1979   182.0    71.0           Center Back      1   \n",
       "6            England  10.11.1985   187.0    80.0         Left Fullback      1   \n",
       "7            England  04.04.1986   180.0    68.0  Defensive Midfielder      1   \n",
       "8            England  13.04.1984   193.0    80.0            Goalkeeper      1   \n",
       "9            Germany  13.03.1987   180.0    70.0        Right Fullback      1   \n",
       "10             Spain  26.01.1984   177.0    74.0        Right Fullback      2   \n",
       "11           England  05.07.1986   181.0    74.0           Left Winger      1   \n",
       "12           Germany  29.02.1988   187.0    80.0           Center Back      1   \n",
       "13           England  25.02.1982   186.0    77.0  Defensive Midfielder      1   \n",
       "14           England  14.12.1984   185.0    74.0                   NaN      1   \n",
       "15           Germany  23.08.1988   186.0    76.0        Right Fullback      1   \n",
       "16           Germany  14.01.1987   184.0    85.0         Left Fullback      1   \n",
       "17           England  29.04.1981   180.0    74.0         Left Fullback      1   \n",
       "18           England  08.09.1989   186.0    77.0  Attacking Midfielder      1   \n",
       "19             Spain  25.07.1988   181.0    74.0         Left Fullback      1   \n",
       "20           Germany  01.11.1989   180.0    75.0  Attacking Midfielder      1   \n",
       "21           Germany  07.04.1986   186.0    76.0  Attacking Midfielder      1   \n",
       "22           England  02.01.1988   188.0    77.0           Center Back      1   \n",
       "23           Germany  23.02.1992   183.0    85.0           Center Back      1   \n",
       "24           England  13.03.1989   170.0    63.0  Attacking Midfielder      1   \n",
       "25           Germany  16.12.1988   192.0    90.0           Center Back      1   \n",
       "26             Spain  15.10.1988   183.0    76.0  Attacking Midfielder      1   \n",
       "27           Germany  05.08.1977   189.0    88.0  Defensive Midfielder      1   \n",
       "28           England  21.10.1981   188.0    82.0           Center Back      2   \n",
       "29           Germany  10.12.1988   193.0    88.0           Center Back      2   \n",
       "...              ...         ...     ...     ...                   ...    ...   \n",
       "117353       England  08.03.1994   180.0     NaN        Center Forward      1   \n",
       "118341       England  01.04.1994   182.0     NaN                   NaN      1   \n",
       "120273        France  09.06.1994   167.0    59.0                   NaN      1   \n",
       "120280        France  26.03.1991   173.0    71.0                   NaN      1   \n",
       "120377        France  23.01.1994   179.0    68.0           Left Winger      1   \n",
       "121494       England  05.03.1993   188.0     NaN                   NaN      1   \n",
       "121798        France  10.12.1993   182.0    75.0                   NaN      1   \n",
       "121911        France  29.07.1994   175.0    71.0                   NaN      1   \n",
       "121939        France  28.07.1993   169.0    61.0  Defensive Midfielder      1   \n",
       "122216        France  20.04.1993   175.0    75.0     Center Midfielder      1   \n",
       "122915        France  03.01.1992     NaN     NaN                   NaN      1   \n",
       "123250        France  02.03.1988   194.0    92.0            Goalkeeper      1   \n",
       "125070        France  10.12.1993   182.0    75.0                   NaN      1   \n",
       "125306        France  31.03.1988   179.0    77.0                   NaN      1   \n",
       "125413        France  11.02.1992   180.0    74.0          Right Winger      1   \n",
       "125519        France  03.01.1992     NaN     NaN                   NaN      1   \n",
       "129412       England  08.03.1994   180.0     NaN        Center Forward      1   \n",
       "130405        France  10.12.1993   182.0    75.0                   NaN      1   \n",
       "131014        France  21.04.1992   174.0    66.0                   NaN      1   \n",
       "131197        France  11.02.1992   180.0    74.0          Right Winger      1   \n",
       "139030        France  13.12.1989   181.0    78.0           Center Back      1   \n",
       "139048        France  08.03.1993   190.0    90.0            Goalkeeper      1   \n",
       "139150        France  29.06.1994   184.0    77.0           Center Back      1   \n",
       "139769        France  24.02.1994   176.0    65.0  Defensive Midfielder      1   \n",
       "139946        France  23.01.1994   179.0    68.0           Left Winger      1   \n",
       "141414       England  08.03.1994   180.0     NaN        Center Forward      1   \n",
       "143329        France  03.06.1991   187.0    80.0            Goalkeeper      1   \n",
       "143408        France  21.04.1992   174.0    66.0                   NaN      1   \n",
       "143973        France  17.06.1981   182.0    71.0                   NaN      1   \n",
       "144067        France  04.01.1992   174.0    70.0                   NaN      1   \n",
       "\n",
       "        victories  ...  rater2  refNum  refCountry  Alpha_3   meanIAT  \\\n",
       "0               0  ...    0.50       1           1      GRC  0.326391   \n",
       "1               0  ...    0.75       2           2      ZMB  0.203375   \n",
       "2               0  ...     NaN       3           3      ESP  0.369894   \n",
       "3               1  ...     NaN       3           3      ESP  0.369894   \n",
       "4               1  ...     NaN       3           3      ESP  0.369894   \n",
       "5               0  ...    0.00       4           4      LUX  0.325185   \n",
       "6               1  ...    0.25       4           4      LUX  0.325185   \n",
       "7               0  ...    1.00       4           4      LUX  0.325185   \n",
       "8               0  ...    0.25       4           4      LUX  0.325185   \n",
       "9               1  ...    0.00       4           4      LUX  0.325185   \n",
       "10              2  ...    0.00       4           4      LUX  0.325185   \n",
       "11              1  ...    0.50       4           4      LUX  0.325185   \n",
       "12              1  ...    0.00       4           4      LUX  0.325185   \n",
       "13              0  ...    0.00       4           4      LUX  0.325185   \n",
       "14              0  ...    0.25       4           4      LUX  0.325185   \n",
       "15              1  ...    0.00       4           4      LUX  0.325185   \n",
       "16              1  ...    0.50       4           4      LUX  0.325185   \n",
       "17              0  ...    0.00       4           4      LUX  0.325185   \n",
       "18              0  ...    0.00       4           4      LUX  0.325185   \n",
       "19              1  ...    0.25       4           4      LUX  0.325185   \n",
       "20              0  ...    0.25       4           4      LUX  0.325185   \n",
       "21              1  ...    0.00       4           4      LUX  0.325185   \n",
       "22              0  ...    0.00       4           4      LUX  0.325185   \n",
       "23              0  ...    0.00       4           4      LUX  0.325185   \n",
       "24              1  ...    0.00       4           4      LUX  0.325185   \n",
       "25              1  ...    0.25       4           4      LUX  0.325185   \n",
       "26              1  ...    0.25       4           4      LUX  0.325185   \n",
       "27              1  ...    0.25       4           4      LUX  0.325185   \n",
       "28              2  ...    0.00       4           4      LUX  0.325185   \n",
       "29              2  ...    0.25       4           4      LUX  0.325185   \n",
       "...           ...  ...     ...     ...         ...      ...       ...   \n",
       "117353          1  ...     NaN    2481         100      JAM  0.109661   \n",
       "118341          0  ...     NaN    2517          44     ENGL  0.326690   \n",
       "120273          0  ...     NaN    2546          57      AUT  0.337539   \n",
       "120280          0  ...    0.75    2546          57      AUT  0.337539   \n",
       "120377          0  ...    0.75    2546          57      AUT  0.337539   \n",
       "121494          0  ...     NaN    2572          64      NLD  0.352920   \n",
       "121798          1  ...     NaN    2598           7      FRA  0.334684   \n",
       "121911          0  ...     NaN    2598           7      FRA  0.334684   \n",
       "121939          0  ...     NaN    2598           7      FRA  0.334684   \n",
       "122216          0  ...     NaN    2598           7      FRA  0.334684   \n",
       "122915          0  ...     NaN    2619           7      FRA  0.334684   \n",
       "123250          1  ...     NaN    2619           7      FRA  0.334684   \n",
       "125070          0  ...     NaN    2672           7      FRA  0.334684   \n",
       "125306          0  ...     NaN    2672           7      FRA  0.334684   \n",
       "125413          0  ...     NaN    2672           7      FRA  0.334684   \n",
       "125519          0  ...     NaN    2673           7      FRA  0.334684   \n",
       "129412          0  ...     NaN    2791          32      CHE  0.345305   \n",
       "130405          0  ...     NaN    2796           7      FRA  0.334684   \n",
       "131014          0  ...     NaN    2796           7      FRA  0.334684   \n",
       "131197          0  ...     NaN    2796           7      FRA  0.334684   \n",
       "139030          0  ...    1.00    2951          22      POL  0.369958   \n",
       "139048          0  ...     NaN    2953          32      CHE  0.345305   \n",
       "139150          1  ...    0.25    2960          80      FIN  0.396512   \n",
       "139769          0  ...    1.00    2961           7      FRA  0.334684   \n",
       "139946          0  ...    0.75    2961           7      FRA  0.334684   \n",
       "141414          0  ...     NaN    3015          47      PER  0.398503   \n",
       "143329          0  ...     NaN    3077           7      FRA  0.334684   \n",
       "143408          0  ...     NaN    3077           7      FRA  0.334684   \n",
       "143973          0  ...     NaN    3080           7      FRA  0.334684   \n",
       "144067          0  ...    0.50    3083          30      COL  0.377402   \n",
       "\n",
       "           nIAT     seIAT   meanExp     nExp     seExp  \n",
       "0         712.0  0.000564  0.396000    750.0  0.002696  \n",
       "1          40.0  0.010875 -0.204082     49.0  0.061504  \n",
       "2        1785.0  0.000229  0.588297   1897.0  0.001002  \n",
       "3        1785.0  0.000229  0.588297   1897.0  0.001002  \n",
       "4        1785.0  0.000229  0.588297   1897.0  0.001002  \n",
       "5         127.0  0.003297  0.538462    130.0  0.013752  \n",
       "6         127.0  0.003297  0.538462    130.0  0.013752  \n",
       "7         127.0  0.003297  0.538462    130.0  0.013752  \n",
       "8         127.0  0.003297  0.538462    130.0  0.013752  \n",
       "9         127.0  0.003297  0.538462    130.0  0.013752  \n",
       "10        127.0  0.003297  0.538462    130.0  0.013752  \n",
       "11        127.0  0.003297  0.538462    130.0  0.013752  \n",
       "12        127.0  0.003297  0.538462    130.0  0.013752  \n",
       "13        127.0  0.003297  0.538462    130.0  0.013752  \n",
       "14        127.0  0.003297  0.538462    130.0  0.013752  \n",
       "15        127.0  0.003297  0.538462    130.0  0.013752  \n",
       "16        127.0  0.003297  0.538462    130.0  0.013752  \n",
       "17        127.0  0.003297  0.538462    130.0  0.013752  \n",
       "18        127.0  0.003297  0.538462    130.0  0.013752  \n",
       "19        127.0  0.003297  0.538462    130.0  0.013752  \n",
       "20        127.0  0.003297  0.538462    130.0  0.013752  \n",
       "21        127.0  0.003297  0.538462    130.0  0.013752  \n",
       "22        127.0  0.003297  0.538462    130.0  0.013752  \n",
       "23        127.0  0.003297  0.538462    130.0  0.013752  \n",
       "24        127.0  0.003297  0.538462    130.0  0.013752  \n",
       "25        127.0  0.003297  0.538462    130.0  0.013752  \n",
       "26        127.0  0.003297  0.538462    130.0  0.013752  \n",
       "27        127.0  0.003297  0.538462    130.0  0.013752  \n",
       "28        127.0  0.003297  0.538462    130.0  0.013752  \n",
       "29        127.0  0.003297  0.538462    130.0  0.013752  \n",
       "...         ...       ...       ...      ...       ...  \n",
       "117353    517.0  0.000881 -1.076350    537.0  0.004087  \n",
       "118341  44791.0  0.000010  0.356446  46916.0  0.000037  \n",
       "120273   1319.0  0.000331  0.394139   1365.0  0.001717  \n",
       "120280   1319.0  0.000331  0.394139   1365.0  0.001717  \n",
       "120377   1319.0  0.000331  0.394139   1365.0  0.001717  \n",
       "121494   5952.0  0.000070  0.445679   6121.0  0.000269  \n",
       "121798   2882.0  0.000151  0.336101   3011.0  0.000586  \n",
       "121911   2882.0  0.000151  0.336101   3011.0  0.000586  \n",
       "121939   2882.0  0.000151  0.336101   3011.0  0.000586  \n",
       "122216   2882.0  0.000151  0.336101   3011.0  0.000586  \n",
       "122915   2882.0  0.000151  0.336101   3011.0  0.000586  \n",
       "123250   2882.0  0.000151  0.336101   3011.0  0.000586  \n",
       "125070   2882.0  0.000151  0.336101   3011.0  0.000586  \n",
       "125306   2882.0  0.000151  0.336101   3011.0  0.000586  \n",
       "125413   2882.0  0.000151  0.336101   3011.0  0.000586  \n",
       "125519   2882.0  0.000151  0.336101   3011.0  0.000586  \n",
       "129412   1886.0  0.000219  0.377193   1938.0  0.000823  \n",
       "130405   2882.0  0.000151  0.336101   3011.0  0.000586  \n",
       "131014   2882.0  0.000151  0.336101   3011.0  0.000586  \n",
       "131197   2882.0  0.000151  0.336101   3011.0  0.000586  \n",
       "139030   1021.0  0.000412  0.831268   1049.0  0.002034  \n",
       "139048   1886.0  0.000219  0.377193   1938.0  0.000823  \n",
       "139150   2273.0  0.000187  1.031407   2388.0  0.000931  \n",
       "139769   2882.0  0.000151  0.336101   3011.0  0.000586  \n",
       "139946   2882.0  0.000151  0.336101   3011.0  0.000586  \n",
       "141414    254.0  0.001682  0.697417    271.0  0.008707  \n",
       "143329   2882.0  0.000151  0.336101   3011.0  0.000586  \n",
       "143408   2882.0  0.000151  0.336101   3011.0  0.000586  \n",
       "143973   2882.0  0.000151  0.336101   3011.0  0.000586  \n",
       "144067    601.0  0.000703  0.588050    636.0  0.003579  \n",
       "\n",
       "[10149 rows x 28 columns]"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "all_cols_unique_players = df.groupby('playerShort')\n",
    "all_cols_unique_players.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Count the unique variables (if we got different weight values, \n",
    "# for example, then we should get more than one unique value in this groupby)\n",
    "all_cols_unique_players = df.groupby('playerShort').agg({col:'nunique' for col in player_cols})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>birthday</th>\n",
       "      <th>height</th>\n",
       "      <th>weight</th>\n",
       "      <th>position</th>\n",
       "      <th>photoID</th>\n",
       "      <th>rater1</th>\n",
       "      <th>rater2</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>playerShort</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>aaron-hughes</th>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>aaron-hunt</th>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>aaron-lennon</th>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>aaron-ramsey</th>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>abdelhamid-el-kaoutari</th>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                        birthday  height  weight  position  photoID  rater1  \\\n",
       "playerShort                                                                   \n",
       "aaron-hughes                   1       1       1         1        1       1   \n",
       "aaron-hunt                     1       1       1         1        1       1   \n",
       "aaron-lennon                   1       1       1         1        1       1   \n",
       "aaron-ramsey                   1       1       1         1        1       1   \n",
       "abdelhamid-el-kaoutari         1       1       1         1        1       1   \n",
       "\n",
       "                        rater2  \n",
       "playerShort                     \n",
       "aaron-hughes                 1  \n",
       "aaron-hunt                   1  \n",
       "aaron-lennon                 1  \n",
       "aaron-ramsey                 1  \n",
       "abdelhamid-el-kaoutari       1  "
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "all_cols_unique_players.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>birthday</th>\n",
       "      <th>height</th>\n",
       "      <th>weight</th>\n",
       "      <th>position</th>\n",
       "      <th>photoID</th>\n",
       "      <th>rater1</th>\n",
       "      <th>rater2</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>playerShort</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "Empty DataFrame\n",
       "Columns: [birthday, height, weight, position, photoID, rater1, rater2]\n",
       "Index: []"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# If all values are the same per player then this should be empty (and it is!)\n",
    "all_cols_unique_players[all_cols_unique_players > 1].dropna().head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "\n",
    "# A slightly more elegant way to test the uniqueness\n",
    "all_cols_unique_players[all_cols_unique_players > 1].dropna().shape[0] == 0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_subgroup(dataframe, g_index, g_columns):\n",
    "    \n",
    "    \"\"\"Helper function that creates a sub-table from the columns and runs a quick uniqueness test.\"\"\"\n",
    "    g = dataframe.groupby(g_index).agg({col:'nunique' for col in g_columns})\n",
    "    if g[g > 1].dropna().shape[0] != 0:\n",
    "        print(\"Warning: you probably assumed this had all unique values but it doesn't.\")\n",
    "    return dataframe.groupby(g_index).agg({col:'max' for col in g_columns})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>birthday</th>\n",
       "      <th>height</th>\n",
       "      <th>weight</th>\n",
       "      <th>position</th>\n",
       "      <th>photoID</th>\n",
       "      <th>rater1</th>\n",
       "      <th>rater2</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>playerShort</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>aaron-hughes</th>\n",
       "      <td>08.11.1979</td>\n",
       "      <td>182.0</td>\n",
       "      <td>71.0</td>\n",
       "      <td>Center Back</td>\n",
       "      <td>3868.jpg</td>\n",
       "      <td>0.25</td>\n",
       "      <td>0.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>aaron-hunt</th>\n",
       "      <td>04.09.1986</td>\n",
       "      <td>183.0</td>\n",
       "      <td>73.0</td>\n",
       "      <td>Attacking Midfielder</td>\n",
       "      <td>20136.jpg</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.25</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>aaron-lennon</th>\n",
       "      <td>16.04.1987</td>\n",
       "      <td>165.0</td>\n",
       "      <td>63.0</td>\n",
       "      <td>Right Midfielder</td>\n",
       "      <td>13515.jpg</td>\n",
       "      <td>0.25</td>\n",
       "      <td>0.25</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>aaron-ramsey</th>\n",
       "      <td>26.12.1990</td>\n",
       "      <td>178.0</td>\n",
       "      <td>76.0</td>\n",
       "      <td>Center Midfielder</td>\n",
       "      <td>94953.jpg</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>abdelhamid-el-kaoutari</th>\n",
       "      <td>17.03.1990</td>\n",
       "      <td>180.0</td>\n",
       "      <td>73.0</td>\n",
       "      <td>Center Back</td>\n",
       "      <td>124913.jpg</td>\n",
       "      <td>0.25</td>\n",
       "      <td>0.25</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                          birthday  height  weight              position  \\\n",
       "playerShort                                                                \n",
       "aaron-hughes            08.11.1979   182.0    71.0           Center Back   \n",
       "aaron-hunt              04.09.1986   183.0    73.0  Attacking Midfielder   \n",
       "aaron-lennon            16.04.1987   165.0    63.0      Right Midfielder   \n",
       "aaron-ramsey            26.12.1990   178.0    76.0     Center Midfielder   \n",
       "abdelhamid-el-kaoutari  17.03.1990   180.0    73.0           Center Back   \n",
       "\n",
       "                           photoID  rater1  rater2  \n",
       "playerShort                                         \n",
       "aaron-hughes              3868.jpg    0.25    0.00  \n",
       "aaron-hunt               20136.jpg    0.00    0.25  \n",
       "aaron-lennon             13515.jpg    0.25    0.25  \n",
       "aaron-ramsey             94953.jpg    0.00    0.00  \n",
       "abdelhamid-el-kaoutari  124913.jpg    0.25    0.25  "
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "players = get_subgroup(df, player_index, player_cols)\n",
    "players.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [],
   "source": [
    "def save_subgroup(dataframe, g_index, subgroup_name, prefix='raw_'):\n",
    "    save_subgroup_filename = \"\".join([prefix, subgroup_name, \".csv.gz\"])\n",
    "    dataframe.to_csv(save_subgroup_filename, compression='gzip', encoding='UTF-8')\n",
    "    test_df = pd.read_csv(save_subgroup_filename, compression='gzip', index_col=g_index, encoding='UTF-8')\n",
    "    # Test that we recover what we send in\n",
    "    if dataframe.equals(test_df):\n",
    "        print(\"Test-passed: we recover the equivalent subgroup dataframe.\")\n",
    "    else:\n",
    "        print(\"Warning -- equivalence test!!! Double-check.\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Test-passed: we recover the equivalent subgroup dataframe.\n"
     ]
    }
   ],
   "source": [
    "save_subgroup(players, player_index, \"players\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create Tidy Clubs Table\n",
    "\n",
    "The clubs table is indexed by `club` and has a single column, `leagueCountry`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>leagueCountry</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>club</th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1. FC Nürnberg</th>\n",
       "      <td>Germany</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1. FSV Mainz 05</th>\n",
       "      <td>Germany</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1899 Hoffenheim</th>\n",
       "      <td>Germany</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>AC Ajaccio</th>\n",
       "      <td>France</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>AFC Bournemouth</th>\n",
       "      <td>England</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                leagueCountry\n",
       "club                         \n",
       "1. FC Nürnberg        Germany\n",
       "1. FSV Mainz 05       Germany\n",
       "1899 Hoffenheim       Germany\n",
       "AC Ajaccio             France\n",
       "AFC Bournemouth       England"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "club_index = 'club'\n",
    "club_cols = ['leagueCountry']\n",
    "clubs = get_subgroup(df, club_index, club_cols)\n",
    "clubs.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "England    48\n",
       "Spain      27\n",
       "France     22\n",
       "Germany    21\n",
       "Name: leagueCountry, dtype: int64"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "clubs['leagueCountry'].value_counts()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Test-passed: we recover the equivalent subgroup dataframe.\n"
     ]
    }
   ],
   "source": [
    "save_subgroup(clubs, club_index, \"clubs\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create Tidy Referees Table"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>refCountry</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>refNum</th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        refCountry\n",
       "refNum            \n",
       "1                1\n",
       "2                2\n",
       "3                3\n",
       "4                4\n",
       "5                5"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "referee_index = 'refNum'\n",
    "referee_cols = ['refCountry']\n",
    "referees = get_subgroup(df, referee_index, referee_cols)\n",
    "referees.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "161"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "referees.refCountry.nunique()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>refCountry</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>refNum</th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>3143</th>\n",
       "      <td>51</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3144</th>\n",
       "      <td>55</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3145</th>\n",
       "      <td>21</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3146</th>\n",
       "      <td>51</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3147</th>\n",
       "      <td>21</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        refCountry\n",
       "refNum            \n",
       "3143            51\n",
       "3144            55\n",
       "3145            21\n",
       "3146            51\n",
       "3147            21"
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "referees.tail()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(3147, 1)"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "referees.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Test-passed: we recover the equivalent subgroup dataframe.\n"
     ]
    }
   ],
   "source": [
    "save_subgroup(referees, referee_index, \"referees\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create Tidy Countries Table"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Alpha_3</th>\n",
       "      <th>meanIAT</th>\n",
       "      <th>nIAT</th>\n",
       "      <th>meanExp</th>\n",
       "      <th>seExp</th>\n",
       "      <th>seIAT</th>\n",
       "      <th>nExp</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>refCountry</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>GRC</td>\n",
       "      <td>0.326391</td>\n",
       "      <td>712.0</td>\n",
       "      <td>0.396000</td>\n",
       "      <td>0.002696</td>\n",
       "      <td>0.000564</td>\n",
       "      <td>750.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>ZMB</td>\n",
       "      <td>0.203375</td>\n",
       "      <td>40.0</td>\n",
       "      <td>-0.204082</td>\n",
       "      <td>0.061504</td>\n",
       "      <td>0.010875</td>\n",
       "      <td>49.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>ESP</td>\n",
       "      <td>0.369894</td>\n",
       "      <td>1785.0</td>\n",
       "      <td>0.588297</td>\n",
       "      <td>0.001002</td>\n",
       "      <td>0.000229</td>\n",
       "      <td>1897.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>LUX</td>\n",
       "      <td>0.325185</td>\n",
       "      <td>127.0</td>\n",
       "      <td>0.538462</td>\n",
       "      <td>0.013752</td>\n",
       "      <td>0.003297</td>\n",
       "      <td>130.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>TUN</td>\n",
       "      <td>0.167132</td>\n",
       "      <td>19.0</td>\n",
       "      <td>-0.789474</td>\n",
       "      <td>0.111757</td>\n",
       "      <td>0.027327</td>\n",
       "      <td>19.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "           Alpha_3   meanIAT    nIAT   meanExp     seExp     seIAT    nExp\n",
       "refCountry                                                                \n",
       "1              GRC  0.326391   712.0  0.396000  0.002696  0.000564   750.0\n",
       "2              ZMB  0.203375    40.0 -0.204082  0.061504  0.010875    49.0\n",
       "3              ESP  0.369894  1785.0  0.588297  0.001002  0.000229  1897.0\n",
       "4              LUX  0.325185   127.0  0.538462  0.013752  0.003297   130.0\n",
       "5              TUN  0.167132    19.0 -0.789474  0.111757  0.027327    19.0"
      ]
     },
     "execution_count": 41,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "country_index = 'refCountry'\n",
    "country_cols = ['Alpha_3',  # three-letter country code; renamed to countryName below\n",
    "                'meanIAT',\n",
    "                'nIAT',\n",
    "                'seIAT',\n",
    "                'meanExp',\n",
    "                'nExp',\n",
    "                'seExp',\n",
    "               ]\n",
    "countries = get_subgroup(df, country_index, country_cols)\n",
    "countries.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>countryName</th>\n",
       "      <th>meanIAT</th>\n",
       "      <th>nIAT</th>\n",
       "      <th>meanExp</th>\n",
       "      <th>seExp</th>\n",
       "      <th>seIAT</th>\n",
       "      <th>nExp</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>refCountry</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>GRC</td>\n",
       "      <td>0.326391</td>\n",
       "      <td>712.0</td>\n",
       "      <td>0.396000</td>\n",
       "      <td>0.002696</td>\n",
       "      <td>0.000564</td>\n",
       "      <td>750.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>ZMB</td>\n",
       "      <td>0.203375</td>\n",
       "      <td>40.0</td>\n",
       "      <td>-0.204082</td>\n",
       "      <td>0.061504</td>\n",
       "      <td>0.010875</td>\n",
       "      <td>49.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>ESP</td>\n",
       "      <td>0.369894</td>\n",
       "      <td>1785.0</td>\n",
       "      <td>0.588297</td>\n",
       "      <td>0.001002</td>\n",
       "      <td>0.000229</td>\n",
       "      <td>1897.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>LUX</td>\n",
       "      <td>0.325185</td>\n",
       "      <td>127.0</td>\n",
       "      <td>0.538462</td>\n",
       "      <td>0.013752</td>\n",
       "      <td>0.003297</td>\n",
       "      <td>130.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>TUN</td>\n",
       "      <td>0.167132</td>\n",
       "      <td>19.0</td>\n",
       "      <td>-0.789474</td>\n",
       "      <td>0.111757</td>\n",
       "      <td>0.027327</td>\n",
       "      <td>19.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "           countryName   meanIAT    nIAT   meanExp     seExp     seIAT    nExp\n",
       "refCountry                                                                    \n",
       "1                  GRC  0.326391   712.0  0.396000  0.002696  0.000564   750.0\n",
       "2                  ZMB  0.203375    40.0 -0.204082  0.061504  0.010875    49.0\n",
       "3                  ESP  0.369894  1785.0  0.588297  0.001002  0.000229  1897.0\n",
       "4                  LUX  0.325185   127.0  0.538462  0.013752  0.003297   130.0\n",
       "5                  TUN  0.167132    19.0 -0.789474  0.111757  0.027327    19.0"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "rename_columns = {'Alpha_3': 'countryName'}\n",
    "countries = countries.rename(columns=rename_columns)\n",
    "countries.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(161, 7)"
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "countries.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Warning -- equivalence test!!! Double-check.\n"
     ]
    }
   ],
   "source": [
    "save_subgroup(countries, country_index, \"countries\")"
   ]
  },
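  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The warning above does not necessarily mean that data was lost. One plausible cause, assumed here rather than verified in this notebook, is that a few floating-point values come back from the CSV file differing in the last digit, which is enough for `DataFrame.equals` to report a mismatch. A minimal sketch of a tolerance-based check follows; `roundtrip_matches` and the toy table are hypothetical stand-ins, not part of the original analysis.\n",
    "\n",
    "```python\n",
    "import io\n",
    "\n",
    "import pandas as pd\n",
    "from pandas.testing import assert_frame_equal\n",
    "\n",
    "\n",
    "def roundtrip_matches(dataframe, g_index, rtol=1e-9):\n",
    "    # Hypothetical helper: write to an in-memory CSV, read it back,\n",
    "    # and compare with a relative tolerance instead of exact equality.\n",
    "    buffer = io.StringIO()\n",
    "    dataframe.to_csv(buffer)\n",
    "    buffer.seek(0)\n",
    "    test_df = pd.read_csv(buffer, index_col=g_index)\n",
    "    try:\n",
    "        assert_frame_equal(dataframe, test_df, check_exact=False, rtol=rtol)\n",
    "    except AssertionError:\n",
    "        return False\n",
    "    return True\n",
    "\n",
    "\n",
    "# Toy stand-in for the countries table (values are made up).\n",
    "toy = pd.DataFrame(\n",
    "    {'countryName': ['AAA', 'BBB'], 'meanIAT': [0.1 + 0.2, 1 / 3]},\n",
    "    index=pd.Index([1, 2], name='refCountry'),\n",
    ")\n",
    "ok = roundtrip_matches(toy, 'refCountry')\n",
    "```\n",
    "\n",
    "If a check like this passes on `countries`, the saved file can likely be trusted; if it still fails, dtype or missing-value differences would be the next things to inspect."
   ]
  },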
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create separate (not yet Tidy) Dyads Table\n",
    "\n",
    "This is one of the more complex tables to reason about, so for now we only split it out and leave the tidying for later."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "dyad_index = ['refNum', 'playerShort']\n",
    "dyad_cols = ['games',\n",
    "             'victories',\n",
    "             'ties',\n",
    "             'defeats',\n",
    "             'goals',\n",
    "             'yellowCards',\n",
    "             'yellowReds',\n",
    "             'redCards',\n",
    "            ]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "dyads = get_subgroup(df, g_index=dyad_index, g_columns=dyad_cols)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 64,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>redCards</th>\n",
       "      <th>victories</th>\n",
       "      <th>defeats</th>\n",
       "      <th>goals</th>\n",
       "      <th>games</th>\n",
       "      <th>yellowCards</th>\n",
       "      <th>ties</th>\n",
       "      <th>yellowReds</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>refNum</th>\n",
       "      <th>playerShort</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <th>lucas-wilchez</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <th>john-utaka</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th rowspan=\"3\" valign=\"top\">3</th>\n",
       "      <th>abdon-prats</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>pablo-mari</th>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ruben-pena</th>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                      redCards  victories  defeats  goals  games  yellowCards  \\\n",
       "refNum playerShort                                                              \n",
       "1      lucas-wilchez         0          0        1      0      1            0   \n",
       "2      john-utaka            0          0        1      0      1            1   \n",
       "3      abdon-prats           0          0        0      0      1            1   \n",
       "       pablo-mari            0          1        0      0      1            0   \n",
       "       ruben-pena            0          1        0      0      1            0   \n",
       "\n",
       "                      ties  yellowReds  \n",
       "refNum playerShort                      \n",
       "1      lucas-wilchez     0           0  \n",
       "2      john-utaka        0           0  \n",
       "3      abdon-prats       1           0  \n",
       "       pablo-mari        0           0  \n",
       "       ruben-pena        0           0  "
      ]
     },
     "execution_count": 64,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dyads.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(146028, 8)"
      ]
     },
     "execution_count": 46,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dyads.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>yellowCards</th>\n",
       "      <th>yellowReds</th>\n",
       "      <th>victories</th>\n",
       "      <th>ties</th>\n",
       "      <th>games</th>\n",
       "      <th>defeats</th>\n",
       "      <th>goals</th>\n",
       "      <th>redCards</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>refNum</th>\n",
       "      <th>playerShort</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>140</th>\n",
       "      <th>bodipo</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>6</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>367</th>\n",
       "      <th>antonio-lopez_2</th>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "      <td>5</td>\n",
       "      <td>2</td>\n",
       "      <td>8</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th rowspan=\"2\" valign=\"top\">432</th>\n",
       "      <th>javi-martinez</th>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "      <td>4</td>\n",
       "      <td>3</td>\n",
       "      <td>14</td>\n",
       "      <td>7</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>jonas</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>4</td>\n",
       "      <td>9</td>\n",
       "      <td>4</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>487</th>\n",
       "      <th>phil-jagielka</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>7</td>\n",
       "      <td>4</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>586</th>\n",
       "      <th>cyril-jeunechamp</th>\n",
       "      <td>6</td>\n",
       "      <td>0</td>\n",
       "      <td>8</td>\n",
       "      <td>0</td>\n",
       "      <td>14</td>\n",
       "      <td>6</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>804</th>\n",
       "      <th>sergio-ramos</th>\n",
       "      <td>6</td>\n",
       "      <td>1</td>\n",
       "      <td>12</td>\n",
       "      <td>1</td>\n",
       "      <td>18</td>\n",
       "      <td>5</td>\n",
       "      <td>4</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>985</th>\n",
       "      <th>aly-cissokho</th>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "      <td>9</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1114</th>\n",
       "      <th>eugen-polanski</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>4</td>\n",
       "      <td>0</td>\n",
       "      <td>8</td>\n",
       "      <td>4</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1214</th>\n",
       "      <th>emmanuel-adebayor</th>\n",
       "      <td>4</td>\n",
       "      <td>1</td>\n",
       "      <td>9</td>\n",
       "      <td>7</td>\n",
       "      <td>23</td>\n",
       "      <td>7</td>\n",
       "      <td>10</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                          yellowCards  yellowReds  victories  ties  games  \\\n",
       "refNum playerShort                                                          \n",
       "140    bodipo                       0           0          2     1      6   \n",
       "367    antonio-lopez_2              2           0          5     2      8   \n",
       "432    javi-martinez                2           0          4     3     14   \n",
       "       jonas                        0           0          1     4      9   \n",
       "487    phil-jagielka                0           0          2     1      7   \n",
       "586    cyril-jeunechamp             6           0          8     0     14   \n",
       "804    sergio-ramos                 6           1         12     1     18   \n",
       "985    aly-cissokho                 1           0          1     5      9   \n",
       "1114   eugen-polanski               0           0          4     0      8   \n",
       "1214   emmanuel-adebayor            4           1          9     7     23   \n",
       "\n",
       "                          defeats  goals  redCards  \n",
       "refNum playerShort                                  \n",
       "140    bodipo                   3      1         2  \n",
       "367    antonio-lopez_2          1      0         2  \n",
       "432    javi-martinez            7      2         2  \n",
       "       jonas                    4      1         2  \n",
       "487    phil-jagielka            4      1         2  \n",
       "586    cyril-jeunechamp         6      0         2  \n",
       "804    sergio-ramos             5      4         2  \n",
       "985    aly-cissokho             3      1         2  \n",
       "1114   eugen-polanski           4      0         2  \n",
       "1214   emmanuel-adebayor        7     10         2  "
      ]
     },
     "execution_count": 47,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dyads[dyads.redCards > 1].head(10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Test-passed: we recover the equivalent subgroup dataframe.\n"
     ]
    }
   ],
   "source": [
    "save_subgroup(dyads, dyad_index, \"dyads\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2"
      ]
     },
     "execution_count": 49,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dyads.redCards.max()"
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  },
  "toc": {
   "colors": {
    "hover_highlight": "#DAA520",
    "running_highlight": "#FF0000",
    "selected_highlight": "#FFD700"
   },
   "moveMenuLeft": true,
   "nav_menu": {
    "height": "318px",
    "width": "252px"
   },
   "navigate_menu": true,
   "number_sections": true,
   "sideBar": true,
   "threshold": 4,
   "toc_cell": false,
   "toc_section_display": "block",
   "toc_window_display": false,
   "widenNotebook": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
